Sr. Engineer, ML Platform

Delivery Hero

Location:
United Kingdom, London

Contract Type:
Not provided

Salary:
Not provided

Job Description:

As the leading delivery platform in the region, we have a unique responsibility and opportunity to positively impact millions of customers, restaurant partners, and riders. To achieve our mission, we must scale and continuously evolve our machine learning capabilities, including cutting-edge Generative AI (genAI) initiatives. This demands robust, efficient, and scalable ML platforms that empower our teams to rapidly develop, deploy, and operate intelligent systems. As an ML Platform Engineer, your mission is to design, build, and enhance the infrastructure and tooling that accelerates the development, deployment, and monitoring of traditional ML and genAI models at scale. You’ll collaborate closely with data scientists, ML engineers, genAI specialists, and product teams to deliver seamless ML workflows—from experimentation to production serving—ensuring operational excellence across our ML and genAI systems.

Job Responsibility:

  • Design, build, and maintain scalable, reusable, and reliable ML platforms and tooling that support the entire ML lifecycle, including data ingestion, model training, evaluation, deployment, and monitoring for both traditional and generative AI models
  • Develop standardized ML workflows and templates using MLflow and other platforms, enabling rapid experimentation and deployment cycles
  • Implement robust CI/CD pipelines, Docker containerization, model registries, and experiment tracking to support reproducibility, scalability, and governance in ML and genAI
  • Collaborate closely with genAI experts to integrate and optimize genAI technologies, including transformers, embeddings, vector databases (e.g., Pinecone, Redis, Weaviate), and real-time retrieval-augmented generation (RAG) systems
  • Automate and streamline ML and genAI model training, inference, deployment, and versioning workflows, ensuring consistency, reliability, and adherence to industry best practices
  • Ensure reliability, observability, and scalability of production ML and genAI workloads by implementing comprehensive monitoring, alerting, and continuous performance evaluation
  • Integrate infrastructure components such as real-time model serving frameworks (e.g., TensorFlow Serving, NVIDIA Triton, Seldon), Kubernetes orchestration, and cloud solutions (AWS/GCP) for robust production environments
  • Drive infrastructure optimization for generative AI use-cases, including efficient inference techniques (batching, caching, quantization), fine-tuning, prompt management, and model updates at scale
  • Partner with data engineering, product, infrastructure, and genAI teams to align ML platform initiatives with broader company goals, infrastructure strategy, and innovation roadmap
  • Contribute actively to internal documentation, onboarding, and training programs, promoting platform adoption and continuous improvement

Requirements:

  • Strong software engineering background with experience in building distributed systems or platforms designed for machine learning and AI workloads
  • Expert-level proficiency in Python and familiarity with ML frameworks (TensorFlow, PyTorch), infrastructure tooling (MLflow, Kubeflow, Ray), and popular APIs (Hugging Face, OpenAI, LangChain)
  • Experience implementing modern MLOps practices, including model lifecycle management, CI/CD, Docker, Kubernetes, model registries, and infrastructure-as-code tools (Terraform, Helm)
  • Demonstrated experience working with cloud infrastructure, ideally AWS or GCP, including Kubernetes clusters (GKE/EKS), serverless architectures, and managed ML services (e.g., Vertex AI, SageMaker)
  • Proven experience with generative AI technologies: transformers, embeddings, prompt engineering strategies, fine-tuning vs. prompt-tuning, vector databases, and retrieval-augmented generation (RAG) systems
  • Experience designing and maintaining real-time inference pipelines, including integrations with feature stores, streaming data platforms (Kafka, Kinesis), and observability platforms
  • Familiarity with SQL and data warehouse modeling; capable of managing complex data queries, joins, aggregations, and transformations
  • Solid understanding of ML monitoring, including identifying model drift, decay, latency optimization, cost management, and scaling API-based genAI applications efficiently
  • Bachelor’s degree in Computer Science, Engineering, or a related field; an advanced degree is a plus
  • 3+ years of experience in ML platform engineering, ML infrastructure, generative AI, or closely related roles
  • Proven track record of successfully building and operating ML infrastructure at scale, ideally supporting generative AI use-cases and complex inference scenarios
  • Strategic mindset with strong problem-solving skills and effective technical decision-making abilities
  • Excellent communication and collaboration skills, comfortable working cross-functionally across diverse teams and stakeholders
  • Strong sense of ownership, accountability, pragmatism, and proactive bias for action

Additional Information:

Job Posted:
January 05, 2026

Employment Type:
Fulltime
Work Type:
On-site work

Similar Jobs for Sr. Engineer, ML Platform

Sr. Staff ML Platform Engineer

Machine learning is the crucial enabler for every financial service that EarnIn ...
Location:
United States, Mountain View
Salary:
360000.00 - 440000.00 USD / Year
EarnIn
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or Master’s degree in Computer Science, Engineering, or a related field
  • 8+ years of industry machine learning experience and excellent software engineering skills
  • Strong programming skills in Python, with familiarity in ML frameworks such as TensorFlow or PyTorch
  • Experience with ML cloud platforms such as AWS Sagemaker, Databricks, or GCP Vertex AI
  • Familiarity with data pipelines and workflow management tools
  • Strong communication and collaboration skills
  • Passion for learning and staying updated with the latest industry trends in machine learning and platform engineering
Job Responsibility:
  • Design, build, and maintain a robust ML platform and tooling ecosystem that supports the entire machine learning lifecycle, from experimentation to production
  • Lead and mentor a team of ML engineers, deeply understanding their workflows to streamline model training, deployment, and monitoring, while ensuring reproducibility and consistency of results
  • Drive scalability, reliability, and cost efficiency of the ML platform, balancing performance with ease of use for scientists and engineers
  • Evaluate and adopt emerging technologies to continually advance the organization’s machine learning capabilities and maintain a competitive edge
  • Champion operational excellence, setting a high bar for engineering quality, reliability, and automation
  • Act as a catalyst for innovation, spearheading step-change improvements that unlock new opportunities for growth and efficiency
What we offer:
  • equity and benefits
Employment Type:
Fulltime

Sr. Principal Software Engineer – Search & Recommendation

We are seeking a Sr. Principal Search & Recommendation Engineer to lead the desi...
Location:
United States, Seattle
Salary:
277391.00 - 342391.00 USD / Year
Highspot
Expiration Date:
Until further notice
Requirements:
  • 8+ years of experience building and scaling search or recommendation systems in production environments
  • Deep expertise in information retrieval, ranking algorithms, collaborative filtering, and/or neural search techniques
  • Strong programming skills in Python, Java, or Scala; experience with ML and IR frameworks such as Elasticsearch, FAISS, TensorFlow, or PyTorch
  • Familiarity with LLMs, embeddings, and modern vector search infrastructure
  • Proven leadership in cross-functional environments with a track record of mentoring and guiding technical teams
  • Strong grasp of MLOps practices and experience with cloud-native ML infrastructure (e.g., AWS, GCP)
Job Responsibility:
  • Lead the end-to-end development of modern search and recommendation systems, from architecture to production deployment
  • Drive technical strategy and innovation in search relevance, personalized ranking, semantic search, and ML-powered retrieval/grounding
  • Collaborate with product, design, and data teams to define and deliver intelligent user experiences
  • Influence platform-level decisions on data pipelines, experimentation frameworks, and performance optimization
  • Mentor engineers, foster technical excellence, and promote a culture of learning and innovation
What we offer:
  • Comprehensive medical, dental, vision, disability, and life benefits
  • Health Savings Account (HSA) with employer contribution
  • 401(k) Matching with immediate vesting on employer match
  • Flexible PTO
  • 8 paid holidays and 5 paid days for Annual Holiday Week
  • Quarterly Recharge Fridays (paid days off for mental health recharge)
  • 18 weeks paid parental leave
  • Access to Coaches and Therapists through Modern Health
  • 2 volunteer days per year
  • Commuting benefits
Employment Type:
Fulltime

Sr Staff Machine Learning Engineer - Ads

Ads is a growing business at Uber. As part of this team, you will get an opportu...
Location:
United States, San Francisco; Sunnyvale
Salary:
267000.00 - 297000.00 USD / Year
Uber
Expiration Date:
Until further notice
Requirements:
  • Provide technical leadership and drive technical direction for the ads delivery area
  • Lead the design and implementation of advanced ML systems for all ads products at Uber
  • Own end-to-end ML model lifecycle from research through production deployment and continuous optimization
  • Build scalable ML architecture and feature management systems supporting ads marketplace
  • Establish ML engineering best practices, monitoring, and operational excellence across the organization
  • Create platform abstractions that enable other ML engineers to iterate faster on improvements
  • Collaborate with ads marketplace product and science teams to productionize cutting-edge ML research
  • Work with platform engineering teams to ensure ML systems meet reliability and performance standards
  • Influence technical roadmaps across multiple teams through technical leadership and strategic thinking
  • Mentor and grow senior ML engineers, establishing technical standards and engineering culture
Job Responsibility:
  • Provide technical leadership, and drive technical direction for the ads delivery area
  • Work with cross-functional teams to find opportunities and implement enhancements to improve consumer experience and unlock value for advertisers and Uber
What we offer:
  • Eligible to participate in Uber's bonus program
  • May be offered an equity award & other types of comp
  • All full-time employees are eligible to participate in a 401(k) plan
  • Eligible for various benefits
Employment Type:
Fulltime

Sr Staff ML Engineer - Production & MLOps Focus - GenAI Security Platform

Join our team building a cutting-edge multi-tenanted GenAI Security Platform tha...
Location:
India, Bengaluru
Salary:
Not provided
Palo Alto Networks
Expiration Date:
Until further notice
Requirements:
  • 4+ years of ML engineering experience with hands-on LLM/NLP work
  • Practical experience building LLM-based applications (agents, multi-turn systems, evaluators)
  • Understanding of model fine-tuning, embedding optimization, and prompt engineering
  • Experience with LLM APIs (OpenAI, Anthropic, AWS Bedrock, Azure OpenAI)
  • Knowledge of LLM orchestration frameworks (LangChain, LlamaIndex, Pydantic AI, custom solutions)
  • Familiarity with model architectures and when to fine-tune vs prompt engineer
  • Strong experience deploying ML models to production at scale
  • Experience with model serving frameworks (vLLM preferred; TensorRT-LLM, Ray Serve, or similar a plus)
  • Kubernetes and Docker proficiency for ML workload orchestration
Job Responsibility:
  • Build and deploy LLM-based agents and multi-step evaluation workflows
  • Fine-tune models, optimize embeddings, and manage model weights and artifacts
  • Deploy and scale ML services on Kubernetes with proper monitoring and resource management
  • Implement experiment tracking, model versioning, and deployment automation
  • Develop observability dashboards for ML metrics, costs, latency, and quality
  • Optimize LLM API usage through caching, batching, and intelligent routing strategies
  • Manage vector database infrastructure and semantic search systems
  • Create CI/CD pipelines for ML artifacts and automated testing frameworks
  • Collaborate with ML researchers to productionize prototypes and scale experiments
Employment Type:
Fulltime

Sr. Software Engineer (Agentic Runtime)

Dialpad’s AI Engineering organization is responsible for building and maintainin...
Location:
Argentina, Buenos Aires
Salary:
Not provided
Dialpad
Expiration Date:
Until further notice
Requirements:
  • 3–6 years of experience in distributed systems, platform engineering, or ML infrastructure, with exposure to LLM-based or agentic systems strongly preferred
  • Strong understanding of agent architectures, including ReAct, plan-and-execute, and multi-agent coordination patterns
  • Deep knowledge of context management, prompt lifecycle, tool-call protocols (e.g., function calling, MCP), and agent memory strategies (short-term, episodic, and long-term)
  • Experience integrating and managing external tool ecosystems, including web search, code interpreters, databases, and third-party APIs
  • Familiarity with retrieval-augmented generation (RAG) and how retrieval fits into broader agentic pipelines
  • Understanding of LLM output reliability challenges — hallucination, non-determinism, and retry/fallback strategies at runtime
  • Proficiency in Go and Python 3 (experience with Rust or TypeScript is a plus)
  • Strong understanding of distributed systems, microservices, and event-driven architectures suited to long-running agent tasks
  • Passion for real-time performance optimization, including streaming responses, async execution, and parallel tool invocation
  • Experience with API design using OpenAPI, Swagger, or equivalent, with an eye toward agentic interaction patterns
Job Responsibility:
  • Contribute to the design, development, and maintenance of agentic runtime systems, including agent orchestration, tool execution pipelines, and multi-step reasoning loops
  • Build and optimize core runtime components, including task planners, action dispatchers, memory managers, and context window management systems
  • Work on agent coordination techniques, including dynamic tool selection, parallel agent execution, state management, and result aggregation across multi-agent workflows
  • Maintain and enhance highly scalable agentic platforms with a focus on low-latency execution, cost efficiency, and deterministic behavior
  • Ensure high availability, reliability, and fault tolerance in agent runtime services, including graceful degradation when LLM or tool calls fail
  • Collaborate with cross-functional teams — including ML researchers, product, and platform engineers — to translate agentic product requirements into robust runtime infrastructure
  • Develop and optimize real-time distributed systems, microservices, and event-driven architectures powering agentic task execution
  • Design and implement sandboxed execution environments for safe agent use of tools, code execution, and external API calls
  • Implement and maintain monitoring, alerting, and performance metrics covering agent run success rates, token consumption, latency, and cost attribution
  • Evaluate and integrate emerging agentic frameworks, LLM APIs, and tooling ecosystems to continuously improve platform capabilities
What we offer:
  • Competitive benefits and perks
  • Robust training program
  • Inclusive office environment
  • Recognized Great Place to Work culture

Sr. Machine Learning Engineer – Context Engineering

GEICO is seeking an experienced Sr. Staff Machine Learning Engineer to join our ...
Location:
United States, New York City; Palo Alto; Chevy Chase
Salary:
115000.00 - 230000.00 USD / Year
Geico
Expiration Date:
Until further notice
Requirements:
  • 5+ years of experience designing and building AI/ML platforms and systems using components such as vector databases (e.g. Qdrant, Milvus), data warehouses (e.g. Snowflake), streaming platforms (e.g. Kafka), relational databases (e.g. PostgreSQL), knowledge graphs (e.g. Neo4j), and workflow orchestration (e.g. Airflow, Temporal)
  • Proficient in Python, Java and similar general-purpose programming languages
  • 3+ years’ experience managing the end-to-end software development life cycle (e.g. CI/CD pipelines, Kubernetes-based deployments, testing, monitoring and alerting, production support) for backend systems and APIs
  • 2+ years’ experience building training, fine-tuning, real-time/batch inference, and evaluation systems for AI/ML models and LLMs, especially on GPU-powered infrastructure
  • Bachelor’s degree or above in Computer Science, Engineering, Statistics or a related field
Job Responsibility:
  • Own development of key platform components that power end-to-end GenAI agentic workflows. Examples include knowledge curation & management, search, context management, workflow orchestration, etc.
  • Collaborate with cross-functional teams, including data scientists, ML engineers, software engineers, product managers, designers to gather requirements, define project scope and prioritize feature backlogs for high impact business use cases. Establish pragmatic visions & roadmaps that balance business outcome, product release timelines and engineering excellence
  • Contribute to the selection, evaluation, and implementation of software technologies, tools, and frameworks, balancing build vs. buy, speed to market, maintainability, etc.
  • Lead a small team of engineers for feature & system implementation. Troubleshoot and resolve complex software issues, ensuring optimal platform performance and reliability
  • Mentor and guide junior engineers via code reviews and design sessions, fostering a collaborative and high-performance team culture, elevating AI engineering best practices across the company
What we offer:
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation; a 401K savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Support for flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
Employment Type:
Fulltime

Sr Staff ML Engineer - Applied AI

We are building AI-native discovery experiences across Mobility and Delivery. Se...
Location:
United States, San Francisco
Salary:
267000.00 - 297000.00 USD / Year
Uber
Expiration Date:
Until further notice
Requirements:
  • Master's degree or Ph.D. in Computer Science, Engineering, or Mathematics
  • 12+ years of ML experience, including significant work on large-scale deep learning systems
  • Demonstrated ownership of high-impact ML systems in search, recommendations, or conversational AI
  • Deep expertise in transformers, retrieval systems, ranking, and embedding architectures
  • Strong experience with PyTorch and distributed training
  • Proven ability to set the technical strategy for a large organization and influence product roadmaps at the executive level
  • Strong product intuition and ability to connect model improvements to business outcomes
Job Responsibility:
  • Define and champion the multi-year technical vision and architecture for foundation models across Search, Recommendations, and Conversational AI
  • Set the architectural standard and drive system design for critical, high-leverage ML platforms across Mobility and Delivery
  • Lead cross-team initiatives spanning Retrieval, Ranking, Personalization, and LLM-powered assistants, resolving complex technical trade-offs across organizational boundaries
  • Define long-term investment areas (build vs fine-tune vs partner models) with clear business rationale and long-term viability
  • Provide principal-level technical leadership, mentoring Staff and Senior Staff engineers, and setting the bar for technical excellence across the entire AI organization
What we offer:
  • Eligible to participate in Uber's bonus program
  • May be offered an equity award & other types of comp
  • All full-time employees are eligible to participate in a 401(k) plan
  • Eligible for various benefits
Employment Type:
Fulltime

Sr Program Manager Tech - Gen AI

Lead program delivery and client engagement in the domain of AI training and eva...
Location:
United States, Sunnyvale; San Francisco; New York
Salary:
167000.00 - 185500.00 USD / Year
Uber
Expiration Date:
Until further notice
Requirements:
  • 7+ years of overall experience, with specific familiarity in software engineering, ML engineering, ML ops domains
  • Familiarity and experience in leading or managing client interactions (i.e., AI labs, foundation LLM companies, agentic AI companies) for data annotation, training, evaluation, performance benchmarking in the area of coding and development for foundational AI/LLM/ML is required
  • Experience in client facing service delivery management, solutioning, governance - with external client stakeholders at senior levels and/or their AI teams
  • Familiarity with strategies for delivery and QC processes in this domain is required
  • Track record of driving innovation and thought leadership in AI/ML/LLM training and evaluation services
  • Strong ability to communicate, bring clarity of thought in messaging for senior management as well as broader teams
  • Strong collaboration skills and abilities - working across silos and team structures to drive impact effectively
  • Ability to work in a global organization across locations and time zones
Job Responsibility:
  • Client engagement for presales support - partner with Sales to interact with prospective clients to shape the project scope, evangelise our capabilities, design the delivery solution, and governance approach
  • Client engagement for program delivery - represent the service delivery organization and collaborate with them in order to drive ongoing governance, enable troubleshooting, find up/cross sell opportunities, bring thought leadership with client teams
  • Program delivery - help to manage US/onshore based delivery of annotation / training/ evaluation of AI/LLM/ML for coding and data areas, where required
  • Innovation and thought leadership - demonstrate deep understanding and expertise of coding and data analytics related AI training/evals including agentic AI with prospective clients
  • Sourcing strategy and implementation inputs - collaborate with our Supply team to help source and develop worker pools in the US/onshore with technical expertise for coding and data related training/evals
  • Tech platform capability and roadmap inputs - collaborate with our Product and Engineering teams to help develop a roadmap for tech and tooling required specific to coding and data analytics related tasking
  • Stakeholder management - represent the coding and data AI capabilities at senior leadership level interactions and forums, evangelise our capabilities, drive sponsorship and backing for initiatives
  • Best practices - continually improve ways of work, enhance delivery maturity, elevate governance and impact
What we offer:
  • Eligible to participate in Uber's bonus program
  • May be offered an equity award & other types of comp
  • Eligible for various benefits
Employment Type:
Fulltime