ML Ops Engineer Job at NStarX (Hyderabad)

ML Ops Engineer

We are hiring an ML Ops Engineer for our GCC client, Europe’s top retail brands....

Location

India, Bangalore

Salary:

Not provided

SRKay Consulting Group

Expiration Date

Until further notice

Requirements

Workflow Management: Experience in managing Apache Airflow and Composer to support the Data Engineering components of grounded AI solutions
MLflow: Deep knowledge of MLflow Tracking, Projects, and Registry. Experience migrating MLflow backends between cloud providers
Workflow Tools: Familiarity with Vertex AI Pipelines and Azure DevOps for automation
GCP AI Services: Practical experience with Vertex AI (Workbench, Model Garden, Feature Store) and BigQuery ML
Containerization: Expert-level Docker and Kubernetes (GKE/AKS) skills. Must understand K8s operators and resource management for ML workloads
Infrastructure as Code (IaC): Proficiency in Terraform to manage reproducible cloud environments
Programming: Advanced Python skills with a focus on software engineering best practices (unit testing, modular design)
Data Engineering: Experience with Change Data Capture (CDC), Spark/PySpark, and optimizing data flow from BigQuery to training nodes
Access Control: Knowledge of IAM roles, VPC Service Controls, and securing ML endpoints
Experience with LLMOps (managing large-scale foundation models, prompt versioning, and vector database scaling)

Job Responsibility

Pipeline Orchestration: Design, develop, and maintain complex ML workflows using Apache Airflow (Cloud Composer) to automate data ingestion, preprocessing, and model training
Lifecycle Management: Administer and scale MLflow for experiment tracking, model packaging, and maintaining a centralized Model Registry across the organization
Cloud & Hybrid Ops: Create and optimize training environments for custom ML/LLM models
Model Serving & Scaling: Architect high-performance inference endpoints and serve models via FastAPI/Flask with API Gateway
Infrastructure Management: Manage auto-scaling CUDA clusters on Google Kubernetes Engine (GKE)
CI/CD: Manage end-to-end delivery with Continuous Integration & Continuous Delivery (CI/CD)
Observability & Monitoring: Build dashboards to track model health, latency, and data drift

Full-time

ML Ops Engineer

Location

India, Miracle Heights

Salary:

Not provided

Miracle Software Systems

Expiration Date

Until further notice

Requirements

Java
Python
SQL
GCP
Docker
Terraform
7-10 years of experience

Job Responsibility

Design, build, and maintain end-to-end MLOps pipelines for model training, deployment, monitoring, and continuous improvement in production environments
Develop backend services and APIs using Python and Java frameworks to operationalize machine learning models
Implement automated CI/CD workflows for machine learning and data applications
Manage the full model lifecycle, including feature engineering integration, model registry management, version control, and performance tracking
Deploy and operate machine learning workloads on Google Cloud Platform using BigQuery, GCS, Dataflow, and Dataproc
Deploy applications packaged using Docker and orchestrate deployments with Kubernetes
Implement Infrastructure as Code using Terraform for reproducible environment provisioning
Establish model observability practices, including drift detection, performance monitoring, and operational reliability controls
Collaborate with data scientists, platform engineers, and product teams within Agile delivery environments
Maintain SDLC best practices, including source control, security validation, static analysis, and automated quality checks using GitHub, Tekton, SonarQube, 42Crunch, and FOSSA

ML Ops Engineer

The MLOps Engineer will work closely with the Data Science, Analytics, and Data ...

Location

United States

Salary:

127000.00 - 160550.00 USD / Year

Zelis

Expiration Date

Until further notice

Requirements

2–5 years of experience in ML Ops, ML Engineering, or a related role with a focus on production-level model monitoring, automation, and deployment
Strong experience with ML observability tools or custom-built monitoring systems
Experience with monitoring LLMs and Generative AI models, including prompt evaluation, hallucination tracking, and agent behavior auditing
Experience in deploying and managing ML workloads using containerization and orchestration platforms such as Docker, Kubernetes, Kubeflow, or TensorFlow Extended
Familiarity with AutoML pipelines and workflow management tools (e.g., MLflow, SageMaker Autopilot)
Experience working in cloud environments, preferably AWS (e.g., SageMaker, S3, Lambda, ECS/EKS)
Understanding of ML lifecycle tools (e.g., MLflow, SageMaker Pipelines) and CI/CD practices
Strong security and compliance awareness, particularly related to model/data governance (e.g., HIPAA, GDPR)
Proficiency in Python and key data libraries (Pandas, NumPy, Matplotlib, etc.)
Advanced SQL skills and experience with Snowflake or similar data warehousing platforms

Job Responsibility

Build and maintain monitoring infrastructure for conventional machine learning models, with capabilities for performance tracking, drift detection, and alerting
Research, evaluate, and implement monitoring strategies and tools for Generative AI systems, including LLMs and Agentic AI architectures
Collaborate with ML Engineers, Data Scientists, and DevOps teams to deploy, manage, and monitor models in production
Develop and support scalable, secure, and automated data pipelines using Snowflake, SQL, and Python for training, serving, and monitoring ML and GenAI models
Leverage AutoML tools and frameworks (e.g., MLflow, Kubeflow, SageMaker Autopilot) to streamline experimentation and deployment
Design dashboards and reporting systems to visualize model health metrics and surface key operational insights
Ensure auditability, reproducibility, and compliance for model performance and data flow in production environments, with consideration for regulatory standards like GDPR and HIPAA
Maintain CI/CD workflows and version-controlled codebases (e.g., Git) for ML infrastructure and pipelines
Utilize containerization and orchestration technologies (e.g., Docker) to manage scalable ML infrastructure
Leverage tools such as Streamlit and Python visualization libraries to present insights from model and data monitoring

What we offer

401k plan with employer match
flexible paid time off
holidays
parental leave
life and disability insurance
health benefits including medical, dental, vision, and prescription drug coverage

Full-time

ML Ops Engineer

The ML Ops Engineer role involves building ML pipelines, deploying models, and d...

Location

Mexico, Guadalajara

Salary:

Not provided

NTT DATA

Expiration Date

Until further notice

Requirements

5+ Years as an ML Ops Engineer
Proficiency in AWS SageMaker and AWS Cloud Services
Experience with ML lifecycle tools (e.g., MLflow, Kubeflow)
Familiarity with Weights & Biases for experiment tracking
Hands-on with Databricks for scalable data and ML workflows
Strong Python programming skills
Experience developing GitHub Actions using TypeScript for CI/CD
Experience with Kubernetes for container orchestration
Understanding of Edge ML deployment strategies
Expertise in ML training and inference workflows

Job Responsibility

Build ML Pipelines and deploy models
Define and develop APIs and MCP Servers
Work on projects leveraging your expertise in data science, artificial intelligence, and machine learning
Assist in breaking down complex business problems, developing solutions, and delivering with a high degree of focus on client satisfaction
Conduct market research, develop a point-of-view and communicate effectively back to clients and stakeholders
Bring innovative thinking and resourcefulness, leveraging best practices and creativity to achieve successful client outcomes
Establish relationships with our clients at the appropriate levels, gain an understanding of the project work and problems encountered
Work with data sets of varying degrees of size and complexity including both structured and unstructured data
Pipe and process massive data streams in distributed computing environments
Implement batch and real-time model scoring

AI Ops ML Ops Engineer

Whitehall Resources are currently looking for an AI Ops ML Ops Engineer. Key Requ...

Location

Salary:

Not provided

Whitehall Resources Ltd

Expiration Date

Until further notice

Requirements

7+ years of experience in ML Ops, DevOps, AI/ML deployment, monitoring, cloud platforms, and production support
Build strong cross-functional ways of working across Data & AI, IT, Digital and business teams so delivery is aligned, practical and business-led
Keep the internal and external customer experience at the center of data, analytics and AI delivery, with focus on reliable outcomes and decision support
Continuously build capability in modern data, analytics and AI practices and actively share knowledge with peers and business users
Apply structured problem solving to simplify complex data, process and technology issues and remove barriers to execution
Identify practical opportunities to improve business performance using modern data platforms, analytics, automation, GenAI and embedded AI capabilities
Adjust priorities and delivery approach in a dynamic business environment while maintaining governance, quality and business continuity
Deploy, monitor and manage ML models and AI services across the lifecycle
Apply release, versioning, automation and controlled deployment practices
Monitor uptime, drift, performance, data quality and operational metrics

Job Responsibility

Operationalizes, monitors and supports AI/ML solutions in production
Ensures models, pipelines and AI services are deployed, monitored, governed and maintained with reliable operational practices
Implement ML Ops and deployment practices
Monitor model and solution health
Support production AI/ML systems
Ensure auditability and governance of AI operations
Improve automation and reliability

Senior ML Ops Engineer - Architecture & Strategy

We own the platform blueprint for our ML infrastructure: designing systems that ...

Location

Germany, Munich

Salary:

Not provided

BMW

Expiration Date

Until further notice

Requirements

University degree in Computer Science, Computer/Electrical Engineering or related subjects
5–8+ years in ML platform or infrastructure engineering, with at least two years in a tech lead or architect role
Deep expertise in either AWS, Azure or Google cloud, ideally with multi-region or multi-account setups
Proven track record designing systems for PB-scale data and hundreds of concurrent training jobs as well as understanding of large vision models and the challenges of compressing them for automotive-grade SoCs
Strong knowledge of Kubernetes platform design, GitOps, and infrastructure-as-code
Excellent communication skills to align ML researchers, embedded engineers, data teams, and executives
Familiarity with edge model compilation toolchains for Qualcomm (QNN, AIMET) and/or NVIDIA (TensorRT, Triton) and experience with automotive data at scale, such as MDF4, MCAP, ROS bags, and multi-sensor synchronisation

Job Responsibility

You design the reference architecture for the ML platform end-to-end: data ingestion, PB-scale data lake, heterogeneous training clusters, model registry, and deployment-ready artefacts
You design the data-format backbone, setting standards for data flows, ingestion, cataloguing, transcoding, and partitioning at PB scale, integrated with dataset management tooling
You define the platform component topology and integration contracts for pipeline orchestration, experiment tracking, hyperparameter optimisation, dataset management, observability, and metadata
You establish model lifecycle governance, including experiment tracking, approval gates, validation criteria, and clear handoff contracts to deployment teams
You drive cost governance at PB scale, including accelerator spot strategies, S3 tiering, cross-AZ traffic reduction, and Kubernetes cluster right-sizing
You partner with Security, Legal, and Functional-Safety teams on ISO 26262, ISO 8800, and data-protection compliance

What we offer

Challenging projects with which we shape the mobility of tomorrow together
Wide range of personal and professional development opportunities
Attractive, fair and performance-related remuneration
High level of job security
Annual special payments such as vacation pay, Christmas bonus, and profit sharing
Flexible working hours including six weeks annual leave and overtime compensation
Discounted BMW & MINI conditions
Many other benefits at bmw.jobs/benefits

Full-time

ML Ops Engineer, Central Software

Boston Dynamics’ mission is to imagine and create robots that enrich people’s live...

Location

United States, Waltham

Salary:

Not provided

Boston Dynamics

Expiration Date

Until further notice

Requirements

7+ years experience as an ML Platform engineer
Demonstrated expert-level proficiency in Python (mandatory) and system programming (e.g., Go, C++, Rust)
Demonstrated proficiency managing and configuring cloud resources with Infrastructure as Code (e.g., Terraform, Ansible)
Expert in scalable ML deployments via Kubernetes
Hands-on knowledge of designing pipelines to automate code deployment, model training, and validation using Argo CD, CI, etc.
Configuring and using observability metrics to drive improvements in system metrics (e.g., CPU/GPU/Latency/Performance)
Experience managing/operating in hybrid hosted/on-prem compute environments
Experience working collaboratively in a cross-functional team using Agile, Scrum, or another lean approach
Bachelor's degree in Engineering, Computer Science, or another technical area

Job Responsibility

ML Operations: Evolve/scale/optimize fielded solutions and enable orchestration of ML training workloads on GPU clusters
ML Infrastructure Support: Work closely with others to implement, deploy, and maintain ML infrastructure
New Capabilities: Transform proofs of concepts into scalable solutions, helping deliver new robot capabilities to customers
Engagement: Work with stakeholders across BD to understand requirements, ensuring deployed solutions meet end-user needs
Ownership: Own the end-to-end lifecycle, spanning implementation, testing, deployment, and monitoring
Coordination: Participate in agile development process, work with others, identify challenges, and regularly communicate progress
Mentorship: Use your experience to mentor/upskill peers and other contributors across the organization

Full-time

Data Engineer / ML Ops

As our Data Engineer, you will design, build, and maintain the data infrastructu...

Location

Germany, Berlin; Potsdam

Salary:

Not provided

Sensmore GmbH

Expiration Date

Until further notice

Requirements

3+ years of hands-on experience building production data pipelines in the cloud (AWS, GCP, or Azure)
Proficiency in Python, SQL, and at least one big-data framework
Familiarity with ML Ops tooling: DVC, MLflow, Kubeflow, or similar
Experience designing and operating data warehouses/data lakes (e.g., Redshift, Snowflake, BigQuery, Delta Lake)
Strong understanding of distributed systems, data serialization (Parquet, Avro), and batch vs. streaming paradigms
Excellent problem-solving skills and the ability to work in ambiguous, fast-paced environments

Job Responsibility

Build & operate data pipelines: Ingest, process, and transform multi-sensor telemetry (radar point-clouds, video frames, log streams) into analytics-ready and ML-ready formats
Design scalable storage: Architect high-throughput, low-latency data lakes and warehouses (e.g., S3, Delta Lake, Redshift/Snowflake)
Enable ML Ops workflows: Integrate DVC or MLflow, automate model training/retraining triggers, track data/model lineage
Ensure data quality: Implement validation, monitoring, and alerting to catch anomalies and schema changes early
Collaborate cross-functionally: Partner with Embedded Systems, Robotics, and Software teams to align on data schemas, APIs, and real-time requirements
Optimize performance: Tune distributed processing, queries, and storage layouts for cost-efficiency and throughput
Document & evangelize: Maintain clear documentation for data schemas, pipeline architectures, and ML Ops practices to uplift the whole team

What we offer

Attractive compensation package and stock options
Beverages on-site and regular social events
Engage with top-tier researchers, engineers, and thought leaders
Influence the future of robotic technologies and tackle significant technological challenges
Assistance with relocation to Berlin

Full-time
