AI Ops ML Ops Engineer Job at Whitehall Resources Ltd

Senior Software Engineer, AI & ML Ops

Hyundai AutoEver America seeks a seasoned Senior AI/ML Engineer to architect, de...

Location

United States, Irvine

Salary:

103170.00 - 158873.00 USD / Year

Hyundai AutoEver America

Expiration Date

Until further notice

Requirements

Bachelor’s or Master’s degree in Computer Science, Engineering, AI, or a related field; advanced degrees/certifications are a plus
8+ years of software engineering experience, including 3+ years in AI/ML solution development
Proven experience designing and deploying LLM-based solutions, traditional ML models, RAG systems, and agent workflows
Strong expertise in Python, TensorFlow/PyTorch, Hugging Face, prompt engineering, vector databases, and AI orchestration
Hands-on experience with AWS SageMaker/Bedrock, Azure OpenAI, or Azure ML Studio, plus MLOps best practices (CI/CD, testing, model monitoring)
Proficiency in frontend frameworks (React), cloud-native deployment (Docker/Kubernetes), microservice APIs, and relational/NoSQL databases

Job Responsibility

Architect and develop scalable AI/ML and LLM-based systems, including RAG pipelines, agentic workflows, predictive models, and generative AI solutions
Build full‑stack AI applications, including React-based dashboards and front‑end interfaces integrated with backend services and cloud infrastructure
Develop data pipelines and ML Ops workflows using Python, SQL, AWS/Azure platforms, and monitoring tools to train, deploy, and optimize models
Lead cross-functional AI initiatives, deliver PoCs/MVPs, ensure compliance with AI governance, and integrate AI features into enterprise and user-facing systems
Provide technical leadership and mentorship, guiding standards, code reviews, model documentation, and best practices in AI/ML development
Continuously improve AI performance and reliability through prompt engineering, architecture enhancements, and data optimization

What we offer

comprehensive medical/dental coverage
generous PTO
education assistance
annual merit increase eligibility

Full-time

ML Ops Engineer

We are hiring an ML Ops Engineer for our GCC client, Europe’s top retail brands....

Location

India, Bangalore

Salary:

Not provided

SRKay Consulting Group

Expiration Date

Until further notice

Requirements

Workflow Management: Experience in managing Apache Airflow and Composer to support the Data Engineering components of grounded AI solutions
MLflow: Deep knowledge of MLflow Tracking, Projects, and Registry. Experience migrating MLflow backends between cloud providers
Workflow Tools: Familiarity with Vertex AI Pipelines and Azure DevOps for automation
GCP AI Services: Practical experience with Vertex AI (Workbench, Model Garden, Feature Store) and BigQuery ML
Containerization: Expert-level Docker and Kubernetes (GKE/AKS) skills. Must understand K8s operators and resource management for ML workloads
Infrastructure as Code (IaC): Proficiency in Terraform to manage reproducible cloud environments
Programming: Advanced Python skills with a focus on software engineering best practices (unit testing, modular design)
Data Engineering: Experience with Change Data Capture (CDC), Spark/PySpark, and optimizing data flow from BigQuery to training nodes
Access Control: Knowledge of IAM roles, VPC Service Controls, and securing ML endpoints
Experience with LLMOps (managing large-scale foundation models, prompt versioning, and vector database scaling)

Job Responsibility

Pipeline Orchestration: Design, develop, and maintain complex ML workflows using Apache Airflow (Cloud Composer) to automate data ingestion, preprocessing, and model training
Lifecycle Management: Administer and scale MLflow for experiment tracking, model packaging, and maintaining a centralized Model Registry across the organization
Cloud & Hybrid Ops: Create and optimize training environments for custom ML/LLM models
Model Serving & Scaling: Architect high-performance inference endpoints and serve models via FastAPI/Flask with API Gateway
Infrastructure Management: Manage auto-scaling CUDA clusters on Google Kubernetes Engine (GKE)
CI/CD: Manage end-to-end delivery through continuous integration and continuous delivery pipelines
Observability & Monitoring: Build dashboards to track model health, latency, and data drift

Full-time

ML Ops Engineer

The MLOps Engineer will work closely with the Data Science, Analytics, and Data ...

Location

United States

Salary:

127000.00 - 160550.00 USD / Year

Zelis

Expiration Date

Until further notice

Requirements

2–5 years of experience in ML Ops, ML Engineering, or a related role with a focus on production-level model monitoring, automation, and deployment
Strong experience with ML observability tools or custom-built monitoring systems
Experience with monitoring LLMs and Generative AI models, including prompt evaluation, hallucination tracking, and agent behavior auditing
Experience in deploying and managing ML workloads using containerization and orchestration platforms such as Docker, Kubernetes, Kubeflow, or TensorFlow Extended
Familiarity with AutoML pipelines and workflow management tools (e.g., MLflow, SageMaker Autopilot)
Experience working in cloud environments, preferably AWS (e.g., SageMaker, S3, Lambda, ECS/EKS)
Understanding of ML lifecycle tools (e.g., MLflow, SageMaker Pipelines) and CI/CD practices
Strong security and compliance awareness, particularly related to model/data governance (e.g., HIPAA, GDPR)
Proficiency in Python and key data libraries (Pandas, NumPy, Matplotlib, etc.)
Advanced SQL skills and experience with Snowflake or similar data warehousing platforms

Job Responsibility

Build and maintain monitoring infrastructure for conventional machine learning models, with capabilities for performance tracking, drift detection, and alerting
Research, evaluate, and implement monitoring strategies and tools for Generative AI systems, including LLMs and Agentic AI architectures
Collaborate with ML Engineers, Data Scientists, and DevOps teams to deploy, manage, and monitor models in production
Develop and support scalable, secure, and automated data pipelines using Snowflake, SQL, and Python for training, serving, and monitoring ML and GenAI models
Leverage AutoML tools and frameworks (e.g., MLflow, Kubeflow, SageMaker Autopilot) to streamline experimentation and deployment
Design dashboards and reporting systems to visualize model health metrics and surface key operational insights
Ensure auditability, reproducibility, and compliance for model performance and data flow in production environments, with consideration for regulatory standards like GDPR and HIPAA
Maintain CI/CD workflows and version-controlled codebases (e.g., Git) for ML infrastructure and pipelines
Utilize containerization and orchestration technologies (e.g., Docker) to manage scalable ML infrastructure
Leverage tools such as Streamlit and Python visualization libraries to present insights from model and data monitoring

What we offer

401k plan with employer match
flexible paid time off
holidays
parental leave
life and disability insurance
health benefits including medical, dental, vision, and prescription drug coverage

Full-time

AI/ML Engineer

Octopus was founded with a mission to use technology to accelerate us towards a ...

Location

United Kingdom, London

Salary:

Not provided

Octopus Energy

Expiration Date

Until further notice

Requirements

Deep Understanding of GenAI - experience working with LLMs
Data Product Development - experience building Python-based applications and/or data products, with hands-on work in data-intensive and machine learning systems
AI model evaluation and observability - Experience with different approaches to evaluating AI models and applications, and with implementing logging, tracing, and monitoring in systems
Context Engineering and Knowledge Grounding - Experience of optimising and grounding GenAI models and applications through prompt design, RAG and knowledge base integration
Software Development Practices - Strong grounding in Git, testing, CI/CD frameworks
Ability to thrive in a fast moving environment - Dealing with ambiguity, setting clear priorities, and translating ideas into actionable plans

Job Responsibility

Design and Develop AI Platform Services - Build reusable, scalable services that expose GenAI models, knowledge retrieval pipelines, and agent workflows to application teams
Knowledge Base Development - Build and maintain knowledge retrieval systems including embedding generation, chunking, and strategies for database management
AI Ops, evals and observability - Setting up frameworks for monitoring and evaluating AI output quality (relevance, accuracy, safety, drift, cost) and platform observability (latency, cost, usage)
Context Engineering - Design systems for prompt assembly: Create prompt templates, system prompts and guidelines for platform users

Full-time

ML Ops Engineer

As an MLOps Engineer, you will be responsible for building, maintaining, and opt...

Location

India, Hyderabad

Salary:

Not provided

NStarX

Expiration Date

Until further notice

Requirements

4 to 10 years of experience in MLOps, DevOps, or ML Engineering
Strong proficiency with cloud platforms such as AWS, Azure, or GCP
Experience with containerization and orchestration tools like Docker and Kubernetes
Hands-on experience with ML model deployment, monitoring, and scaling
Proficiency with CI/CD tools such as Jenkins or GitLab CI
Familiarity with data versioning and management tools such as DVC
Strong coding skills in Python with knowledge of ML libraries like TensorFlow or PyTorch
Strong problem-solving skills and ability to work in a collaborative environment
Effective communication skills for cross-functional teamwork

Job Responsibility

Develop and manage infrastructure for end-to-end ML workflows including model training, deployment, monitoring, and maintenance
Implement CI/CD pipelines for ML models and data workflows
Collaborate with cross-functional teams to build scalable and robust ML infrastructure on cloud and on-premises environments
Monitor and optimize model performance and infrastructure to ensure efficient resource usage
Manage data versioning and model versioning across multiple environments
Implement security, governance, and compliance protocols in ML deployment and data pipelines
Support troubleshooting, debugging, and incident management for ML infrastructure issues

What we offer

Competitive compensation
Opportunity to work with a dynamic team on cutting-edge AI and ML solutions
Professional growth and development opportunities

Full-time

Senior ML Ops Engineer

Join Elsevier as a Senior ML Ops Engineer to lead the development of impactful A...

Location

United States, Philadelphia

Salary:

95300.00 - 158800.00 USD / Year

EdTech Jobs

Expiration Date

Until further notice

Requirements

Current experience in ML Engineering, MLOps platforms, and shipping ML or search/GenAI systems to production
Strong Python, Java, and/or Scala experience
Hands-on experience with major cloud vendor solutions (AWS, Azure and/or Google)
Experience with Search/vector/graph technologies (e.g., Elasticsearch / OpenSearch / Solr / Neo4j)
Experience in evaluating LLMs
A strong understanding of the Data Science Life Cycle including feature engineering, model training, and evaluation metrics
Familiarity with ML frameworks, e.g., PyTorch, TensorFlow, PySpark
Experience with large-scale data processing systems, e.g., Spark
Experience with statistical analysis, machine learning theory and natural language processing

Job Responsibility

Automate and orchestrate machine learning workflows across major cloud and AI platforms (AWS, Azure, Databricks, and foundation model APIs such as OpenAI)
Maintain and version model registries and artifact stores to ensure reproducibility and governance
Develop and manage CI/CD for ML, including automated data validation, model testing, and deployment
Implement ML Engineering solutions using popular MLOps platforms such as AWS SageMaker, MLflow, Azure ML
Scale end-to-end custom SageMaker pipelines
Design and implement the engineering components of GAR+RAG systems (e.g., query interpretation and reflection, chunking, embeddings, hybrid retrieval, semantic search), manage prompt libraries, guardrails and structured output for LLMs hosted on Bedrock/SageMaker or self-hosted
Design and implement ML pipelines that utilize Elasticsearch/OpenSearch/Solr, vector DBs, and graph DBs
Build evaluation pipelines: offline IR metrics (NDCG, MAP, MRR), LLM quality metrics (faithfulness, grounding), and A/B testing
Optimize infrastructure costs through monitoring, scaling strategies, and efficient resource utilization
Stay current with the latest GAI, NLP, and RAG research and apply the state of the art in our experiments and systems

What we offer

Annual incentive bonus
Country specific benefits
Fair and accessible hiring process with accommodation support

Full-time

Vice President ML Ops Engineer

Embark on a transformative journey as Vice President- ML Operations Engineer at ...

Location

India, Noida

Salary:

Not provided

Barclays

Expiration Date

Until further notice

Requirements

Experience in Programming & Automation: Python, Bash, SQL
MLOps Tools: MLflow, Kubeflow, AWS SageMaker Pipelines
Cloud Platforms: AWS (SageMaker, Bedrock, Lambda, Step Functions, CloudWatch)
DevOps Expertise: CI/CD (GitHub Actions, Jenkins), Docker, Kubernetes
Data Management: Enterprise data governance, ETL processes
Leadership Skills: Strategic planning, team management, stakeholder communication

Job Responsibility

Definition and oversight of data governance and procedures to address control and regulatory requirements, including Data Privacy
Definition and oversight of data analytics and insights that support the effective management of the business as well as driving commercial outcomes
Development of a team of data professionals with expertise in data analytics, data engineering, data science, and other relevant disciplines
Analysis of the bank's current data landscape, identification of key data assets and gaps, and development of a roadmap for future data initiatives
Monitoring team performance and setting clear performance expectations
Lead the design and governance of MLOps frameworks, AWS-based architectures, and automation strategies to enable efficient, secure, and scalable deployment of AI and Generative AI models

What we offer

Hybrid working
Modern workspaces, collaborative areas, and state-of-the-art meeting rooms
Facilities include wellness rooms, on-site cafeterias, fitness centers, and tech-equipped workstations

Full-time

AI Ops Platform Engineer

Join us as an AI Ops Engineer, to build and run an enterprise AI Factory within ...

Location

United Kingdom, London

Salary:

Not provided

Barclays

Expiration Date

Until further notice

Requirements

LLMOps / MLOps at production scale, operating the full Generative AI lifecycle including models, prompts and agents, CI/CD pipelines, structured evaluation, drift and hallucination monitoring, and controlled, auditable release processes suitable for banking environments
Cloud‑native AI platform engineering on AWS, with hands‑on delivery using services such as Amazon Bedrock for foundation models, agent orchestration patterns, Lambda and Step Functions, alongside demonstrated Python engineering capability and secure microservices and API design
AI governance, observability and cost optimisation, embedding governance by design through policy as code, alignment to model risk framework expectations, lifecycle traceability and audit‑ready evidence, supported by SRE‑grade monitoring and ongoing optimisation of token usage and compute cost across AI workloads

Job Responsibility

Build and run an enterprise AI Factory within our Card Merchant Services organisation, enabling AI‑driven change across the merchant payments lifecycle
Accountable for the end‑to‑end operationalisation of AI, spanning model, prompt, and agent lifecycles; deployment and monitoring; guardrails; and cost optimisation, ensuring AI solutions are production‑ready, auditable, compliant, and scalable across merchant payment use cases
Accountable for the end‑to‑end engineering of GenAI and ML platforms, embedding governance, observability and operational resilience by design, while enabling teams to deploy and run AI solutions with clarity, assurance and accountability at scale
Lead and manage engineering teams, providing technical guidance, mentorship, and support to ensure the delivery of high-quality software solutions
Oversee timelines, team allocation, risk management and task prioritization
Mentor and support team members' professional growth, conduct performance reviews, provide actionable feedback, and identify opportunities for improvement
Evaluation and enhancement of engineering processes, tools, and methodologies

What we offer

Competitive holiday allowance
Life assurance
Private medical care
Pension contribution

Full-time
