Senior Infrastructure Engineer - GenAI

India, Chennai · Job Posted April 16, 2026

Job Description

We are seeking an experienced Senior Backend Engineer to design, develop, and maintain the infrastructure powering our generative AI applications. You will work closely with AI engineers, platform teams, and product stakeholders to build scalable, reliable backend systems that support AI model deployment, inference, and integration. This role combines traditional backend engineering expertise with cutting-edge AI infrastructure challenges to deliver robust solutions at enterprise scale.

Job Responsibility

  • Design and implement scalable backend services and APIs for generative AI applications using microservices architecture and cloud-native patterns
  • Build and maintain model serving infrastructure with load balancing, auto-scaling, caching, and failover capabilities for high-availability AI services
  • Deploy and orchestrate containerized AI workloads using Docker, Kubernetes, ECS, and OpenShift across development, staging, and production environments
  • Develop serverless AI functions using AWS Lambda, ECS Fargate, and other cloud services for scalable, cost-effective inference
  • Implement robust CI/CD pipelines for automated deployment of AI services, including model versioning and gradual rollout strategies
  • Create comprehensive monitoring, logging, and alerting systems for AI service performance, reliability, and cost optimization
  • Integrate with various LLM APIs (OpenAI, Anthropic, Google) and open-source models, implementing efficient batching and optimization techniques
  • Build data pipelines for training data preparation, model fine-tuning workflows, and real-time streaming capabilities
  • Ensure adherence to security best practices, including authentication, authorization, API rate limiting, and data encryption
  • Collaborate with AI researchers and product teams to translate AI capabilities into production-ready backend services

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience
  • 4–6 years of experience in backend engineering with focus on scalable, production systems
  • 2+ years of hands-on experience with containerization, Kubernetes, and cloud infrastructure in production environments
  • Demonstrated experience with AI/ML model deployment and serving in production systems
  • Strong experience with backend development using Python, with familiarity in Go, Node.js, or Java for building scalable web services and APIs
  • Hands-on experience with containerization using Docker and orchestration platforms including Kubernetes, OpenShift, and AWS ECS in production environments
  • Proficient with cloud infrastructure, particularly AWS services (Lambda, ECS, EKS, S3, RDS, ElastiCache) and serverless architectures
  • Experience with CI/CD pipelines using Jenkins, GitLab CI, GitHub Actions, or similar tools, including Infrastructure as Code with Terraform or CloudFormation
  • Strong knowledge of databases including PostgreSQL, MongoDB, Redis, and experience with vector databases for AI applications
  • Familiarity with message queues (RabbitMQ, Apache Kafka, AWS SQS/SNS) and event-driven architectures
  • Experience with monitoring and observability tools such as Prometheus, Grafana, DataDog, or equivalent platforms
  • Knowledge of AI/ML model serving frameworks like MLflow, Kubeflow, TensorFlow Serving, or Triton Inference Server
  • Understanding of API design principles, load balancing, caching strategies, and performance optimization techniques
  • Experience with microservices architecture, distributed systems, and handling high-traffic, low-latency applications

Similar Jobs for Senior Infrastructure Engineer - GenAI

8 matching positions

ML Engineer Senior - GenAI Solutions

As a Senior Machine Learning Engineer at NTT DATA, you will work alongside exper...
Location: Italy, Milano
Salary: Not provided
NTT DATA
Expiration Date: Until further notice
Requirements
  • At least 5 years of production experience working in Data Science or Software Engineering
  • Deep knowledge of math, probability, statistics and algorithms
  • At least 6 to 12 months of experience in Generative AI deployment and handling of the underlying architecture
  • Vector database knowledge is highly appreciated
  • Understanding of data structures, data modeling and software architecture
  • Fluent in at least two mainstream programming languages (Python, Scala, Java, C++)
  • Experience in building infrastructure for technical users, such as data scientists, ML practitioners or data consumers/producers
  • Strong knowledge of Spark; Databricks is a strong plus
  • Experience developing/deploying ML solutions on one of the public cloud platforms and on a cross-cloud basis; Snowflake knowledge is a plus
  • Deep knowledge of machine learning frameworks (such as Keras or PyTorch)
Job Responsibility
  • Apply hands-on Generative AI capabilities, preferably on Azure/GCP and on-premise GenAI architectures and MLOps
  • Leverage a strong mathematical background
  • Work on classification, information retrieval, clustering and optimization problems
  • Establish scalable, efficient and automated processes for large-scale data analysis
  • Contribute to model development, model validation and model implementation
  • Identify business opportunities
  • Design and create new data pipelines from scratch, from experiments to production deployment
  • Manage multiple projects
  • Lead ML Engineers
  • Connect with stakeholders
  • Fulltime

Senior ML Engineer (GenAI, AWS)

Provectus helps companies adopt ML/AI to transform the ways they operate, compet...
Location: Colombia, Medellín; Bogotá; Cali; Barranquilla; Bucaramanga
Salary: Not provided
Provectus
Expiration Date: Until further notice
Requirements
  • ML Fundamentals: supervised, unsupervised, and reinforcement learning
  • Model Development: feature engineering, model training, evaluation, hyperparameter tuning, and validation
  • ML Frameworks: classical ML libraries, TensorFlow, PyTorch, or similar frameworks
  • Deep Learning: CNNs, RNNs, Transformers
  • LLM Applications: Experience building production LLM-based applications
  • Prompt Engineering: Ability to design effective prompts and chain-of-thought strategies
  • RAG Systems: Experience building retrieval-augmented generation architectures
  • Vector Databases: Familiarity with embedding models and vector search
  • LLM Evaluation: Experience with evaluation metrics and techniques for LLM outputs
  • Python: Advanced proficiency in Python for ML applications
Job Responsibility
  • Design and implement end-to-end ML solutions from experimentation to production
  • Build scalable ML pipelines and infrastructure
  • Optimize model performance, efficiency, and reliability
  • Write clean, maintainable, production-quality code
  • Conduct rigorous experimentation and model evaluation
  • Troubleshoot and resolve complex technical challenges
  • Mentor junior and mid-level ML engineers
  • Conduct code reviews and provide constructive feedback
  • Share knowledge through documentation, presentations, and workshops
  • Collaborate with cross-functional teams (DevOps, Data Engineering, SAs)
What we offer
  • Long-term B2B collaboration
  • Fully remote setup
  • A budget for your medical insurance
  • Paid sick leave, vacation, public holidays
  • Continuous learning support, including unlimited AWS certification sponsorship
  • Fulltime

Senior ML Engineer (GenAI, AWS)

Provectus helps companies adopt ML/AI to transform the ways they operate, compet...
Location: Colombia, Medellín
Salary: Not provided
Provectus
Expiration Date: Until further notice
Requirements
  • ML Fundamentals: supervised, unsupervised, and reinforcement learning
  • Model Development: feature engineering, model training, evaluation, hyperparameter tuning, and validation
  • ML Frameworks: classical ML libraries, TensorFlow, PyTorch, or similar frameworks
  • Deep Learning: CNNs, RNNs, Transformers
  • LLM Applications: Experience building production LLM-based applications
  • Prompt Engineering: Ability to design effective prompts and chain-of-thought strategies
  • RAG Systems: Experience building retrieval-augmented generation architectures
  • Vector Databases: Familiarity with embedding models and vector search
  • LLM Evaluation: Experience with evaluation metrics and techniques for LLM outputs
  • Python: Advanced proficiency in Python for ML applications
Job Responsibility
  • Design and implement end-to-end ML solutions from experimentation to production
  • Build scalable ML pipelines and infrastructure
  • Optimize model performance, efficiency, and reliability
  • Write clean, maintainable, production-quality code
  • Conduct rigorous experimentation and model evaluation
  • Troubleshoot and resolve complex technical challenges
  • Mentor junior and mid-level ML engineers
  • Conduct code reviews and provide constructive feedback
  • Share knowledge through documentation, presentations, and workshops
  • Collaborate with cross-functional teams (DevOps, Data Engineering, SAs)
What we offer
  • Long-term B2B collaboration
  • Fully remote setup
  • A budget for your medical insurance
  • Paid sick leave, vacation, public holidays
  • Continuous learning support, including unlimited AWS certification sponsorship
  • Fulltime

GenAI Senior Platform Engineer - Python, VP

Citi's global Innovation Labs is seeking a versatile Senior GenAI Platform Engin...
Location: Canada, Mississauga
Salary: 120800.00 - 170800.00 USD / Year
Citi
Expiration Date: Until further notice
Requirements
  • 7+ years of experience in the software industry, with a strong emphasis on building enterprise software
  • 6+ years of relevant experience developing and implementing scalable and robust platforms, applications, and services using modern libraries and frameworks (e.g., Python: FastAPI, Flask, Pandas, Scikit-learn, Hugging Face; Node.js: Express, NestJS; TypeScript)
  • 5+ years of experience delivering complex backend solutions and services (e.g., APIs, microservices) into production
  • Demonstrated experience in managing and implementing successful projects of varying sizes and complexities
  • Proven understanding of Generative AI systems, AIOps, and application monitoring/evaluation
  • Experience with cloud architectures, with specific experience in public cloud offerings
  • Strong passion and proven hands-on experience integrating with AI/ML technologies
  • Experience with software development agents, agile development, CI/CD pipelines, software testing, and code reviews
Job Responsibility
  • Lead the design, development, and maintenance of highly complex GenAI platforms, applications, and services using Python, Node.js, and TypeScript
  • Ensure the seamless operation, scalability, and integration of AI capabilities across various Citi business units
  • Engage with data science, technical, and business stakeholders to define and design the overall architecture for key use-cases
  • Drive the deployment of new GenAI products and process improvements, working with internal and external partners to design, validate, and deliver solutions
  • Resolve high-impact technical and business problems, leading projects through in-depth evaluation of complex business processes, system architecture, and industry standards
  • Provide expert guidance and advanced knowledge in modern programming, ensuring platform design adheres to architectural blueprints and best practices for generative models
  • Develop and enforce robust coding standards, testing methodologies, debugging practices, and implementation strategies for enterprise-grade solutions across Python, Node.js, and TypeScript
  • Manage multiple concurrent initiatives and projects of varying sizes and complexity
  • Engage with external vendors and startups for joint initiatives and exploration of new technologies
  • Cultivate a comprehensive understanding of how business, architecture, and infrastructure integrate within the GenAI ecosystem at Citi
What we offer
  • Discover the top benefits offered to our global workforce, designed to support your well-being, growth and work-life balance
  • Fulltime

Senior ML Engineer (GenAI)

Provectus helps companies adopt ML/AI to transform the ways they operate, compet...
Location: Colombia, Medellín; Bogotá; Cali; Barranquilla; Bucaramanga
Salary: Not provided
Provectus
Expiration Date: Until further notice
Requirements
  • ML Fundamentals: supervised, unsupervised, and reinforcement learning
  • Model Development: feature engineering, model training, evaluation, hyperparameter tuning, and validation
  • ML Frameworks: classical ML libraries, TensorFlow, PyTorch, or similar frameworks
  • Deep Learning: CNNs, RNNs, Transformers
  • LLM Applications: Experience building production LLM-based applications
  • Prompt Engineering: Ability to design effective prompts and chain-of-thought strategies
  • RAG Systems: Experience building retrieval-augmented generation architectures
  • Vector Databases: Familiarity with embedding models and vector search
  • LLM Evaluation: Experience with evaluation metrics and techniques for LLM outputs
  • Python: Advanced proficiency in Python for ML applications
Job Responsibility
  • Design and implement end-to-end ML solutions from experimentation to production
  • Build scalable ML pipelines and infrastructure
  • Optimize model performance, efficiency, and reliability
  • Write clean, maintainable, production-quality code
  • Conduct rigorous experimentation and model evaluation
  • Troubleshoot and resolve complex technical challenges
  • Mentor junior and mid-level ML engineers
  • Conduct code reviews and provide constructive feedback
  • Share knowledge through documentation, presentations, and workshops
  • Collaborate with cross-functional teams (DevOps, Data Engineering, SAs)
What we offer
  • Long-term B2B collaboration
  • Fully remote setup
  • A budget for your medical insurance
  • Paid sick leave, vacation, public holidays
  • Continuous learning support, including unlimited AWS certification sponsorship

Full-Stack Senior Software Engineer, GenAI Data Products and Platform (VP)

This is your chance to build the foundational systems for 'Citi Assist', a Gener...
Location: United Kingdom, London
Salary: Not provided
Citi
Expiration Date: Until further notice
Requirements
  • Strong experience building and deploying production applications across the full stack
  • Proficiency in multiple languages, including Python and TypeScript/JavaScript (experience with Go or Java is a plus)
  • Deep experience working with data—whether that's building data pipelines, designing analytics systems, or creating data-driven products
  • Experience building user-facing features with modern frontend frameworks like React, Vue, or Angular
  • Strong SQL skills and experience with relational databases like Postgres
  • Experience building backend services and APIs that handle data at scale
  • Comfort with containerised environments and cloud infrastructure (we use OpenShift/Kubernetes)
  • Strong understanding of CI/CD pipelines, testing frameworks, and automation
  • Experience with data visualisation tools and techniques
Job Responsibility
  • Build the tools that make Assist great
  • Work with data across the full stack
  • Own your features end to end
  • Build with safety and quality in mind
  • Set the technical direction
  • Be a great teammate
What we offer
  • 27 days annual leave (plus bank holidays)
  • A discretionary annual performance-related bonus
  • Private Medical Care & Life Insurance
  • Employee Assistance Program
  • Pension Plan
  • Paid Parental Leave
  • Special discounts for employees, family, and friends
  • Fulltime