Sr. Engineer, ML Platform

United Kingdom, London · Job Posted January 05, 2026

Job Description

As the leading delivery platform in the region, we have a unique responsibility and opportunity to positively impact millions of customers, restaurant partners, and riders. To achieve our mission, we must scale and continuously evolve our machine learning capabilities, including cutting-edge Generative AI (genAI) initiatives. This demands robust, efficient, and scalable ML platforms that empower our teams to rapidly develop, deploy, and operate intelligent systems. As an ML Platform Engineer, your mission is to design, build, and enhance the infrastructure and tooling that accelerates the development, deployment, and monitoring of traditional ML and genAI models at scale. You’ll collaborate closely with data scientists, ML engineers, genAI specialists, and product teams to deliver seamless ML workflows—from experimentation to production serving—ensuring operational excellence across our ML and genAI systems.

Job Responsibility

  • Design, build, and maintain scalable, reusable, and reliable ML platforms and tooling that support the entire ML lifecycle, including data ingestion, model training, evaluation, deployment, and monitoring for both traditional and generative AI models
  • Develop standardized ML workflows and templates using MLflow and other platforms, enabling rapid experimentation and deployment cycles
  • Implement robust CI/CD pipelines, Docker containerization, model registries, and experiment tracking to support reproducibility, scalability, and governance in ML and genAI
  • Collaborate closely with genAI experts to integrate and optimize genAI technologies, including transformers, embeddings, vector databases (e.g., Pinecone, Redis, Weaviate), and real-time retrieval-augmented generation (RAG) systems
  • Automate and streamline ML and genAI model training, inference, deployment, and versioning workflows, ensuring consistency, reliability, and adherence to industry best practices
  • Ensure reliability, observability, and scalability of production ML and genAI workloads by implementing comprehensive monitoring, alerting, and continuous performance evaluation
  • Integrate infrastructure components such as real-time model serving frameworks (e.g., TensorFlow Serving, NVIDIA Triton, Seldon), Kubernetes orchestration, and cloud solutions (AWS/GCP) for robust production environments
  • Drive infrastructure optimization for generative AI use-cases, including efficient inference techniques (batching, caching, quantization), fine-tuning, prompt management, and model updates at scale
  • Partner with data engineering, product, infrastructure, and genAI teams to align ML platform initiatives with broader company goals, infrastructure strategy, and innovation roadmap
  • Contribute actively to internal documentation, onboarding, and training programs, promoting platform adoption and continuous improvement

Requirements

  • Strong software engineering background with experience in building distributed systems or platforms designed for machine learning and AI workloads
  • Expert-level proficiency in Python and familiarity with ML frameworks (TensorFlow, PyTorch), infrastructure tooling (MLflow, Kubeflow, Ray), and popular APIs (Hugging Face, OpenAI, LangChain)
  • Experience implementing modern MLOps practices, including model lifecycle management, CI/CD, Docker, Kubernetes, model registries, and infrastructure-as-code tools (Terraform, Helm)
  • Demonstrated experience working with cloud infrastructure, ideally AWS or GCP, including Kubernetes clusters (GKE/EKS), serverless architectures, and managed ML services (e.g., Vertex AI, SageMaker)
  • Proven experience with generative AI technologies: transformers, embeddings, prompt engineering strategies, fine-tuning vs. prompt-tuning, vector databases, and retrieval-augmented generation (RAG) systems
  • Experience designing and maintaining real-time inference pipelines, including integrations with feature stores, streaming data platforms (Kafka, Kinesis), and observability platforms
  • Familiarity with SQL and data warehouse modeling; capable of managing complex data queries, joins, aggregations, and transformations
  • Solid understanding of ML monitoring, including identifying model drift, decay, latency optimization, cost management, and scaling API-based genAI applications efficiently
  • Bachelor’s degree in Computer Science, Engineering, or a related field; advanced degree is a plus
  • 3+ years of experience in ML platform engineering, ML infrastructure, generative AI, or closely related roles
  • Proven track record of successfully building and operating ML infrastructure at scale, ideally supporting generative AI use-cases and complex inference scenarios
  • Strategic mindset with strong problem-solving skills and effective technical decision-making abilities
  • Excellent communication and collaboration skills, comfortable working cross-functionally across diverse teams and stakeholders
  • Strong sense of ownership, accountability, pragmatism, and proactive bias for action

Similar Jobs for Sr. Engineer, ML Platform

8 matching positions

Sr. Staff ML Platform Engineer

Machine learning is the crucial enabler for every financial service that EarnIn ...
Location
United States, Mountain View
Salary
360000.00 - 440000.00 USD / Year
EarnIn
Expiration Date
Until further notice
Requirements
  • Bachelor's or Master’s degree in Computer Science, Engineering, or a related field
  • 8+ years of industry machine learning experience and excellent software engineering skills
  • Strong programming skills in Python, with familiarity in ML frameworks such as TensorFlow or PyTorch
  • Experience with ML cloud platforms such as AWS Sagemaker, Databricks, or GCP Vertex AI
  • Familiarity with data pipelines and workflow management tools
  • Strong communication and collaboration skills
  • Passion for learning and staying updated with the latest industry trends in machine learning and platform engineering
Job Responsibility
  • Design, build, and maintain a robust ML platform and tooling ecosystem that supports the entire machine learning lifecycle, from experimentation to production
  • Lead and mentor a team of ML engineers, deeply understanding their workflows to streamline model training, deployment, and monitoring, while ensuring reproducibility and consistency of results
  • Drive scalability, reliability, and cost efficiency of the ML platform, balancing performance with ease of use for scientists and engineers
  • Evaluate and adopt emerging technologies to continually advance the organization’s machine learning capabilities and maintain a competitive edge
  • Champion operational excellence, setting a high bar for engineering quality, reliability, and automation
  • Act as a catalyst for innovation, spearheading step-change improvements that unlock new opportunities for growth and efficiency
What we offer
  • equity and benefits
  • Fulltime

Sr Software Engineer - Matching ML Platform

Uber is looking for a Software Engineer to join our Matching ML Platform team. T...
Location
United States, Seattle, Washington; San Francisco, California
Salary
202000.00 - 224000.00 USD / Year
Uber
Expiration Date
Until further notice
Requirements
  • 5+ years experience working on the full software life cycle including gathering requirements, project planning, solution design, coding/implementation, testing, rollout/deployment and best practices as an individual contributor
  • Experience with ML in production systems
  • Experience coding in a general-purpose programming language (e.g., C/C++, Java, Python, Go, C#)
  • Fast and passionate learner
  • Strong collaboration, documentation and communication skills
Job Responsibility
  • Build and scale a low-latency platform powering millions of real-time match decisions per second
  • Identify opportunities to improve the performance and health of various ML systems
  • Design modular systems that accelerate product innovation without rework
  • Optimize for fairness, efficiency, and marketplace health at global scale
  • Collaborate across product, infra, and ML teams to deliver business-critical impact
What we offer
  • Bonus program
  • Equity award & other types of compensation
  • 401(k) plan
  • Various benefits
  • Fulltime

Sr Staff ML Engineer - Production & MLOps Focus - GenAI Security Platform

Join our team building a cutting-edge multi-tenanted GenAI Security Platform tha...
Location
India, Bengaluru
Salary
Not provided
Palo Alto Networks
Expiration Date
Until further notice
Requirements
  • 4+ years of ML engineering experience with hands-on LLM/NLP work
  • Practical experience building LLM-based applications (agents, multi-turn systems, evaluators)
  • Understanding of model fine-tuning, embedding optimization, and prompt engineering
  • Experience with LLM APIs (OpenAI, Anthropic, AWS Bedrock, Azure OpenAI)
  • Knowledge of LLM orchestration frameworks (LangChain, LlamaIndex, Pydantic AI, custom solutions)
  • Familiarity with model architectures and when to fine-tune vs prompt engineer
  • Strong experience deploying ML models to production at scale
  • Experience with model serving frameworks (vLLM preferred; TensorRT-LLM, Ray Serve, or similar a plus)
  • Kubernetes and Docker proficiency for ML workload orchestration
Job Responsibility
  • Build and deploy LLM-based agents and multi-step evaluation workflows
  • Fine-tune models, optimize embeddings, and manage model weights and artifacts
  • Deploy and scale ML services on Kubernetes with proper monitoring and resource management
  • Implement experiment tracking, model versioning, and deployment automation
  • Develop observability dashboards for ML metrics, costs, latency, and quality
  • Optimize LLM API usage through caching, batching, and intelligent routing strategies
  • Manage vector database infrastructure and semantic search systems
  • Create CI/CD pipelines for ML artifacts and automated testing frameworks
  • Collaborate with ML researchers to productionize prototypes and scale experiments
  • Fulltime

ML Engineer Sr

This job description indicates the general nature and level of work expected of ...
Location
United States
Salary
59.00 - 88.50 USD / Hour
Advocate Health Care
Expiration Date
Until further notice
Requirements
  • Bachelor’s degree in computer science, data science, mathematics, statistics, or other related field requiring advanced analytics
  • 5 years in deploying, monitoring, and iterating upon machine learning models in production
  • Strong analytical thinker, able to discern business needs
  • Advanced proficiency in Python code writing
  • Extensive knowledge of ML libraries, frameworks, and data structures
  • Proficiency in SQL or other database language
  • Experience with version control systems such as Git, and ML packaging solutions such as Docker
  • Demonstrated self-directed, results-oriented, and creative approach to problem identification and solving
  • Demonstrated ability to work independently with little supervision
Job Responsibility
  • Transform data science prototypes to production quality tools
  • Ensure that machine learning (ML) models generate accurate results for end users
  • Assist with managing ML software and platforms used for computing and model deployment
  • Run tests on ML models and interpret the results
  • Use those results to improve the models as needed
  • Identify changes in data inputs that can affect model performance
  • Communicate effectively with both internal and external clients, explaining highly technical methods and processes to audiences of varying technical backgrounds
  • Continuously study and research new ML tools and technologies
  • Participate in the evaluation of vendor artificial intelligence solutions, acting as the data science subject matter expert on behalf of the enterprise
What we offer
  • Paid Time Off programs
  • Health and welfare benefits such as medical, dental, vision, life, and Short- and Long-Term Disability
  • Flexible Spending Accounts for eligible health care and dependent care expenses
  • Family benefits such as adoption assistance and paid parental leave
  • Defined contribution retirement plans with employer match and other financial wellness programs
  • Educational Assistance Program
  • Fulltime

Sr Staff ML Engineer - Applied AI

We are building AI-native discovery experiences across Mobility and Delivery. Se...
Location
United States, San Francisco
Salary
267000.00 - 297000.00 USD / Year
Uber
Expiration Date
Until further notice
Requirements
  • Master's degree or Ph.D. in Computer Science, Engineering, Mathematics
  • 12+ years of ML experience, including significant work on large-scale deep learning systems
  • Demonstrated ownership of high-impact ML systems in search, recommendations, or conversational AI
  • Deep expertise in transformers, retrieval systems, ranking, and embedding architectures
  • Strong experience with PyTorch and distributed training
  • Proven ability to set the technical strategy for a large organization and influence product roadmaps at the executive level
  • Strong product intuition and ability to connect model improvements to business outcomes
Job Responsibility
  • Define and champion the multi-year technical vision and architecture for foundation models across Search, Recommendations, and Conversational AI
  • Set the architectural standard and drive system design for critical, high-leverage ML platforms across Mobility and Delivery
  • Lead cross-team initiatives spanning Retrieval, Ranking, Personalization, and LLM-powered assistants, resolving complex technical trade-offs across organizational boundaries
  • Define long-term investment areas (build vs fine-tune vs partner models) with clear business rationale and long-term viability
  • Provide principal-level technical leadership, mentoring Staff and Senior Staff engineers, and setting the bar for technical excellence across the entire AI organization
What we offer
  • Eligible to participate in Uber's bonus program
  • May be offered an equity award & other types of comp
  • All full-time employees are eligible to participate in a 401(k) plan
  • Eligible for various benefits
  • Fulltime

Sr. Distinguished AI Engineer (Agentic AI Platform)

At Capital One, we are creating responsible and reliable AI systems, changing ba...
Location
United States, San Jose, California; San Francisco, California
Salary
343400.00 - 392000.00 USD / Year
Capital One
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in Computer Science, Engineering, or AI plus at least 10 years of experience developing AI and ML algorithms or technologies, or Master's degree plus at least 8 years of experience developing AI and ML algorithms or technologies
  • At least 10 years of experience programming with Python, Go, Scala, or Java
  • 9 years of experience deploying scalable and responsible AI solutions on cloud platforms
  • 2+ years of experience supporting Agentic Frameworks
  • 2+ years of experience with LLMOps
  • 8+ years of experience designing mission-critical machine learning platforms
  • 2+ years of experience architecting, designing, developing, integrating, delivering, and supporting complex AI systems
  • Demonstrated ability to lead and mentor multiple engineering teams and influence cross-functional stakeholders up to the VP level
  • Experience developing AI and ML algorithms or technologies using Python, C++, C#, Java, or Golang
  • Master's degree in Computer Science, Computer Engineering, or relevant technical field
Job Responsibility
  • Partner with a cross-functional team of engineers, research scientists, technical program managers, and product managers to deliver AI-powered products
  • Contribute to the north star platform architecture, continuously publishing and refining living diagrams and canonical APIs
  • Standardize and automate agentic workflows
  • Contribute to crafting an end-to-end GenAI SDK, CLI, and starter kits
  • Help bring together a vision of central guardrail services
  • Collaborate with cross-organization architects to drive end-to-end performance
  • Accelerate innovation by incubating proof of concepts and driving RFCs
  • Own central Helm charts, operators and CRDs that auto scale agents to hit tenant SLAs
  • Coach and evangelize - hosting architecture office hours, mentoring Staff, Principal and Senior engineers, authoring technical design documents and blogs and representing Capital One at Tier1 AI conferences
What we offer
  • Performance based incentive compensation, which may include cash bonus(es) and/or long term incentives (LTI)
  • Comprehensive, competitive, and inclusive set of health, financial and other benefits
  • Fulltime

Sr. Lead AI Engineer (Inference Optimization, FM hosting, AI Platform)

At Capital One, we are creating responsible and reliable AI systems, changing ba...
Location
United States, San Jose, California; San Francisco, California; New York, New York; Cambridge, Massachusetts; McLean, Virginia
Salary
229900.00 - 286200.00 USD / Year
Capital One
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 6 years of experience developing AI and ML algorithms or technologies, or a Master's degree in Computer Science, AI, Electrical Engineering, Computer Engineering, or related fields plus at least 4 years of experience developing AI and ML algorithms or technologies
  • At least 6 years of experience programming with Python, Go, Scala, or Java
Job Responsibility
  • Partner with a cross-functional team of engineers, research scientists, technical program managers, and product managers to deliver AI-powered products that change how our associates work and how our customers interact with Capital One
  • Design, develop, test, deploy, and support AI software components, including foundation model training, large language model inference, similarity search, guardrails, model evaluation, experimentation, governance, and observability
  • Leverage a broad stack of Open Source and SaaS AI technologies such as AWS Ultraclusters, Huggingface, VectorDBs, Nemo Guardrails, PyTorch, and more
  • Invent and introduce state-of-the-art LLM optimization techniques to improve the performance — scalability, cost, latency, throughput — of large scale production AI systems
  • Contribute to the technical vision and the long term roadmap of foundational AI systems at Capital One
What we offer
  • Cash bonus(es)
  • Long term incentives (LTI)
  • Comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being
  • Fulltime

Sr Engineer, Machine Learning Engineering

The Senior Engineer, Machine Learning plays a pivotal role in advancing AI capab...
Location
United States, Bellevue; Atlanta; Overland Park; Herndon
Salary
127000.00 - 229100.00 USD / Year
T-Mobile
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in Computer Science, Data Science, Statistics, Informatics, Information Systems, Machine Learning, or another quantitative field
  • 1+ year of experience in designing, developing, and deploying large language models (LLMs) and generative AI systems in production environments
  • 5+ years of experience building and maintaining end-to-end ML pipelines, including data ingestion, training, deployment, monitoring, and optimization
  • 3+ years of experience applying MLOps practices and leveraging cloud platforms (AWS, GCP, or Azure) for scalable AI solutions
  • 5+ years of experience collaborating with cross-functional teams (engineering, data science, and product) to deliver AI-powered applications
  • 2+ years of experience in programming languages such as Python/R, Java/Scala, and/or Go, with hands-on experience in frameworks such as PyTorch, TensorFlow, LangChain, or Hugging Face
  • At least 18 years of age
  • Legally authorized to work in the United States
Job Responsibility
  • Build and manage the complete machine learning and generative AI lifecycle, including research, design, experimentation, development, deployment, monitoring, and maintenance
  • Design, develop, and deploy LLM-based and generative AI models to power scalable and intelligent enterprise applications
  • Architect, optimize, and maintain retrieval-augmented generation (RAG), prompt orchestration, and contextual reasoning pipelines to support diverse AI use cases
  • Implement scalable MLOps pipelines for model deployment, performance monitoring, and continuous improvement
  • Conduct fine-tuning, alignment, and evaluation of LLMs and multimodal models to ensure reliability, efficiency, and fairness
  • Collaborate with data science, engineering, and product teams to translate business needs into generative AI-driven solutions
  • Perform benchmarking, evaluation, and optimization of generative models to improve accuracy, latency, and cost efficiency
  • Research and apply emerging techniques in transformer architectures, multimodal learning, and generative modeling to drive innovation and enhance enterprise capabilities
  • Ensure secure, ethical, and responsible AI deployment, embedding fairness, transparency, and compliance throughout the model lifecycle
  • Mentor and guide team members on generative AI frameworks, best practices, and experimentation methodologies
What we offer
  • annual stock grant
  • employee stock purchase plan
  • 401(k)
  • free, year-round money coaches
  • medical insurance
  • dental insurance
  • vision insurance
  • flexible spending account
  • paid time off
  • up to 12 paid holidays
  • Fulltime