Sr. Engineer, ML Platform

Delivery Hero

Location:
United Kingdom, London

Contract Type:
Not provided

Salary:
Not provided

Job Description:

As the leading delivery platform in the region, we have a unique responsibility and opportunity to positively impact millions of customers, restaurant partners, and riders. To achieve our mission, we must scale and continuously evolve our machine learning capabilities, including cutting-edge Generative AI (genAI) initiatives. This demands robust, efficient, and scalable ML platforms that empower our teams to rapidly develop, deploy, and operate intelligent systems. As an ML Platform Engineer, your mission is to design, build, and enhance the infrastructure and tooling that accelerates the development, deployment, and monitoring of traditional ML and genAI models at scale. You’ll collaborate closely with data scientists, ML engineers, genAI specialists, and product teams to deliver seamless ML workflows—from experimentation to production serving—ensuring operational excellence across our ML and genAI systems.

Job Responsibility:

  • Design, build, and maintain scalable, reusable, and reliable ML platforms and tooling that support the entire ML lifecycle, including data ingestion, model training, evaluation, deployment, and monitoring for both traditional and generative AI models
  • Develop standardized ML workflows and templates using MLflow and other platforms, enabling rapid experimentation and deployment cycles
  • Implement robust CI/CD pipelines, Docker containerization, model registries, and experiment tracking to support reproducibility, scalability, and governance in ML and genAI
  • Collaborate closely with genAI experts to integrate and optimize genAI technologies, including transformers, embeddings, vector databases (e.g., Pinecone, Redis, Weaviate), and real-time retrieval-augmented generation (RAG) systems
  • Automate and streamline ML and genAI model training, inference, deployment, and versioning workflows, ensuring consistency, reliability, and adherence to industry best practices
  • Ensure reliability, observability, and scalability of production ML and genAI workloads by implementing comprehensive monitoring, alerting, and continuous performance evaluation
  • Integrate infrastructure components such as real-time model serving frameworks (e.g., TensorFlow Serving, NVIDIA Triton, Seldon), Kubernetes orchestration, and cloud solutions (AWS/GCP) for robust production environments
  • Drive infrastructure optimization for generative AI use-cases, including efficient inference techniques (batching, caching, quantization), fine-tuning, prompt management, and model updates at scale
  • Partner with data engineering, product, infrastructure, and genAI teams to align ML platform initiatives with broader company goals, infrastructure strategy, and innovation roadmap
  • Contribute actively to internal documentation, onboarding, and training programs, promoting platform adoption and continuous improvement

Requirements:

  • Strong software engineering background with experience in building distributed systems or platforms designed for machine learning and AI workloads
  • Expert-level proficiency in Python and familiarity with ML frameworks (TensorFlow, PyTorch), infrastructure tooling (MLflow, Kubeflow, Ray), and popular APIs (Hugging Face, OpenAI, LangChain)
  • Experience implementing modern MLOps practices, including model lifecycle management, CI/CD, Docker, Kubernetes, model registries, and infrastructure-as-code tools (Terraform, Helm)
  • Demonstrated experience working with cloud infrastructure, ideally AWS or GCP, including Kubernetes clusters (GKE/EKS), serverless architectures, and managed ML services (e.g., Vertex AI, SageMaker)
  • Proven experience with generative AI technologies: transformers, embeddings, prompt engineering strategies, fine-tuning vs. prompt-tuning, vector databases, and retrieval-augmented generation (RAG) systems
  • Experience designing and maintaining real-time inference pipelines, including integrations with feature stores, streaming data platforms (Kafka, Kinesis), and observability platforms
  • Familiarity with SQL and data warehouse modeling; capable of managing complex data queries, joins, aggregations, and transformations
  • Solid understanding of ML monitoring, including identifying model drift and decay, optimizing latency, managing cost, and scaling API-based genAI applications efficiently
  • Bachelor’s degree in Computer Science, Engineering, or a related field; an advanced degree is a plus
  • 3+ years of experience in ML platform engineering, ML infrastructure, generative AI, or closely related roles
  • Proven track record of successfully building and operating ML infrastructure at scale, ideally supporting generative AI use-cases and complex inference scenarios
  • Strategic mindset with strong problem-solving skills and effective technical decision-making abilities
  • Excellent communication and collaboration skills, comfortable working cross-functionally across diverse teams and stakeholders
  • Strong sense of ownership, accountability, pragmatism, and proactive bias for action

Additional Information:

Job Posted:
January 05, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Sr. Engineer, ML Platform

Sr. Staff ML Platform Engineer

Machine learning is the crucial enabler for every financial service that EarnIn ...
Location:
United States, Mountain View
Salary:
360000.00 - 440000.00 USD / Year
EarnIn
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or Master’s degree in Computer Science, Engineering, or a related field
  • 8+ years of industry machine learning experience and excellent software engineering skills
  • Strong programming skills in Python, with familiarity in ML frameworks such as TensorFlow or PyTorch
  • Experience with ML cloud platforms such as AWS Sagemaker, Databricks, or GCP Vertex AI
  • Familiarity with data pipelines and workflow management tools
  • Strong communication and collaboration skills
  • Passion for learning and staying updated with the latest industry trends in machine learning and platform engineering
Job Responsibility:
  • Design, build, and maintain a robust ML platform and tooling ecosystem that supports the entire machine learning lifecycle, from experimentation to production
  • Lead and mentor a team of ML engineers, deeply understanding their workflows to streamline model training, deployment, and monitoring, while ensuring reproducibility and consistency of results
  • Drive scalability, reliability, and cost efficiency of the ML platform, balancing performance with ease of use for scientists and engineers
  • Evaluate and adopt emerging technologies to continually advance the organization’s machine learning capabilities and maintain a competitive edge
  • Champion operational excellence, setting a high bar for engineering quality, reliability, and automation
  • Act as a catalyst for innovation, spearheading step-change improvements that unlock new opportunities for growth and efficiency
What we offer:
  • equity and benefits
Employment Type:
Full-time

Sr. Principal Software Engineer – Search & Recommendation

We are seeking a Sr. Principal Search & Recommendation Engineer to lead the desi...
Location:
United States, Seattle
Salary:
277391.00 - 342391.00 USD / Year
Highspot
Expiration Date:
Until further notice
Requirements:
  • 8+ years of experience building and scaling search or recommendation systems in production environments
  • Deep expertise in information retrieval, ranking algorithms, collaborative filtering, and/or neural search techniques
  • Strong programming skills in Python, Java, or Scala; experience with ML and IR frameworks such as Elasticsearch, FAISS, TensorFlow, or PyTorch
  • Familiarity with LLMs, embeddings, and modern vector search infrastructure
  • Proven leadership in cross-functional environments with a track record of mentoring and guiding technical teams
  • Strong grasp of MLOps practices and experience with cloud-native ML infrastructure (e.g., AWS, GCP)
Job Responsibility:
  • Lead the end-to-end development of modern search and recommendation systems, from architecture to production deployment
  • Drive technical strategy and innovation in search relevance, personalized ranking, semantic search, and ML-powered retrieval/grounding
  • Collaborate with product, design, and data teams to define and deliver intelligent user experiences
  • Influence platform-level decisions on data pipelines, experimentation frameworks, and performance optimization
  • Mentor engineers, foster technical excellence, and promote a culture of learning and innovation
What we offer:
  • Comprehensive medical, dental, vision, disability, and life benefits
  • Health Savings Account (HSA) with employer contribution
  • 401(k) Matching with immediate vesting on employer match
  • Flexible PTO
  • 8 paid holidays and 5 paid days for Annual Holiday Week
  • Quarterly Recharge Fridays (paid days off for mental health recharge)
  • 18 weeks paid parental leave
  • Access to Coaches and Therapists through Modern Health
  • 2 volunteer days per year
  • Commuting benefits
Employment Type:
Full-time

Sr. Machine Learning Engineer – Context Engineering

GEICO is seeking an experienced Sr. Staff Machine Learning Engineer to join our ...
Location:
United States, New York City; Palo Alto; Chevy Chase
Salary:
115000.00 - 230000.00 USD / Year
Geico
Expiration Date:
Until further notice
Requirements:
  • 5+ years of experience designing and building AI/ML platforms and systems using components such as vector databases (e.g., Qdrant, Milvus), data warehouses (e.g., Snowflake), streaming platforms (e.g., Kafka), relational databases (e.g., PostgreSQL), knowledge graphs (e.g., Neo4j), and workflow orchestration (e.g., Airflow, Temporal)
  • Proficient in Python, Java and similar general-purpose programming languages
  • 3+ years’ experience managing the end-to-end software development life cycle (e.g., CI/CD pipelines, Kubernetes-based deployments, testing, monitoring and alerting, production support) for backend systems and APIs
  • 2+ years’ experience building training, fine-tuning, real-time/batch inference, and evaluation systems for AI/ML models and LLMs, especially on GPU-powered infrastructure
  • Bachelor’s degree or above in Computer Science, Engineering, Statistics or a related field
Job Responsibility:
  • Own development of key platform components that power end-to-end GenAI agentic workflows. Examples include knowledge curation & management, search, context management, workflow orchestration, etc.
  • Collaborate with cross-functional teams, including data scientists, ML engineers, software engineers, product managers, and designers, to gather requirements, define project scope, and prioritize feature backlogs for high-impact business use cases. Establish pragmatic visions and roadmaps that balance business outcomes, product release timelines, and engineering excellence
  • Contribute to the selection, evaluation, and implementation of software technologies, tools, and frameworks, balancing build vs. buy, speed to market, maintainability, etc.
  • Lead a small team of engineers for feature & system implementation. Troubleshoot and resolve complex software issues, ensuring optimal platform performance and reliability
  • Mentor and guide junior engineers via code reviews and design sessions, fostering a collaborative and high-performance team culture, elevating AI engineering best practices across the company
What we offer:
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation; a 401K savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
Employment Type:
Full-time

Sr Program Manager Tech - Gen AI

Lead program delivery and client engagement in the domain of AI training and eva...
Location:
United States, Sunnyvale; San Francisco; New York
Salary:
167000.00 - 185500.00 USD / Year
Uber
Expiration Date:
Until further notice
Requirements:
  • 7+ years of overall experience, with specific familiarity in the software engineering, ML engineering, and MLOps domains
  • Familiarity and experience in leading or managing client interactions (i.e., AI labs, foundation LLM companies, agentic AI companies) for data annotation, training, evaluation, performance benchmarking in the area of coding and development for foundational AI/LLM/ML is required
  • Experience in client facing service delivery management, solutioning, governance - with external client stakeholders at senior levels and/or their AI teams
  • Familiarity with strategies for delivery and QC processes in this domain is required
  • Track record of driving innovation and thought leadership in AI/ML/LLM training and evaluation services
  • Strong ability to communicate, bring clarity of thought in messaging for senior management as well as broader teams
  • Strong collaboration skills and abilities - working across silos and team structures to drive impact effectively
  • Ability to work in a global organization across locations and time zones
Job Responsibility:
  • Client engagement for presales support - partner with Sales to interact with prospective clients to shape the project scope, evangelise our capabilities, design the delivery solution, and governance approach
  • Client engagement for program delivery - represent the service delivery organization and collaborate with them in order to drive ongoing governance, enable troubleshooting, find up/cross sell opportunities, bring thought leadership with client teams
  • Program delivery - help manage US/onshore-based delivery of annotation, training, and evaluation of AI/LLM/ML for coding and data areas, where required
  • Innovation and thought leadership - demonstrate deep understanding and expertise of coding and data analytics related AI training/evals including agentic AI with prospective clients
  • Sourcing strategy and implementation inputs - collaborate with our Supply team to help source and develop worker pools in the US/onshore with technical expertise for coding and data related training/evals
  • Tech platform capability and roadmap inputs - collaborate with our Product and Engineering teams to help develop a roadmap for tech and tooling required specific to coding and data analytics related tasking
  • Stakeholder management - represent the coding and data AI capabilities at senior leadership level interactions and forums, evangelise our capabilities, drive sponsorship and backing for initiatives
  • Best practices - continually improve ways of work, enhance delivery maturity, elevate governance and impact
What we offer:
  • Eligible to participate in Uber's bonus program
  • May be offered an equity award & other types of comp
  • Eligible for various benefits
Employment Type:
Full-time

ML Engineer Sr

This job description indicates the general nature and level of work expected of ...
Location:
United States
Salary:
59.00 - 88.50 USD / Hour
Advocate Health Care
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in computer science, data science, mathematics, statistics, or other related field requiring advanced analytics
  • 5 years in deploying, monitoring, and iterating upon machine learning models in production
  • Strong analytical thinker, able to discern business needs
  • Advanced proficiency in Python code writing
  • Extensive knowledge of ML libraries, frameworks, and data structures
  • Proficiency in SQL or other database language
  • Experience with version control systems such as Git, and ML packaging solutions such as Docker
  • Demonstrated self-directed, results oriented and creative approach to problem identification and solving
  • Demonstrated ability to work independently with little supervision
Job Responsibility:
  • Transform data science prototypes to production quality tools
  • Ensure that machine learning (ML) models generate accurate results for end users
  • Assist with managing ML software and platforms used for computing and model deployment
  • Run tests on ML models and interpret the results
  • Use those results to improve the models as needed
  • Identify changes in data inputs that can affect model performance
  • Communicate effectively with both internal and external clients, explaining highly technical methods and processes to audiences of varying technical backgrounds
  • Continuously study and research new ML tools and technologies
  • Participate in the evaluation of vendor artificial intelligence solutions, acting as the data science subject matter expert on behalf of the enterprise
What we offer:
  • Paid Time Off programs
  • Health and welfare benefits such as medical, dental, vision, life, and Short- and Long-Term Disability
  • Flexible Spending Accounts for eligible health care and dependent care expenses
  • Family benefits such as adoption assistance and paid parental leave
  • Defined contribution retirement plans with employer match and other financial wellness programs
  • Educational Assistance Program
Employment Type:
Full-time

Sr Software Engineer - Python

As a Sr. Software Engineer, you serve as a specialist in the engineering team th...
Location:
India, Bangalore
Salary:
Not provided
Blue Yonder
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in computer science is required; a Master’s is preferred
  • 4+ years of software engineering experience building production software
  • Experience in Frontend technologies, JavaScript, TypeScript, React
  • Good working knowledge of Kubernetes and other virtualized execution technologies
  • 1+ years of experience working on at least one cloud environment, GCP preferred
  • 4+ years of Python programming experience with excellent understanding of Object-Oriented Design & Patterns
  • 3+ years of experience in building REST APIs
  • 1+ years of working experience with Kafka and its integration with cloud services
  • 3+ years of Linux scripting experience
  • 1+ years working with traditional and new relational SQL DBMS
Job Responsibility:
  • Design, architect, implement and help operate the Machine Learning platform
  • Develop and gain insight into the application architecture
  • Distill an abstract architecture into concrete design and influence the implementation
  • Observing inefficiencies, both in cost and reliability, of existing processes
  • Researching alternative solutions using custom or existing open source technologies
  • Designing replacement processes and components
  • Implementing processes, extending and configuring open source components
  • Work with the ML DevOps and Support teams to operate ML platform
  • Helping implement DevOps best practices of in-house and open source components
  • Ensuring smooth operation via monitoring and alerting facilities
Employment Type:
Full-time

Sr Software Development Engineer - ML OPs

Everseen is a leader in vision AI. We are transforming business operations for g...
Location:
Serbia, Belgrade
Salary:
Not provided
Everseen
Expiration Date:
Until further notice
Requirements:
  • 4-5 years of work experience in a relevant role at a global SaaS company
  • Bachelor’s degree or equivalent in a computer science field is preferred
  • Excellent communication and cross-functional collaboration skills
  • Comfort working in ambiguous and fast-evolving environments
  • Expert knowledge of Python
  • Experience with CI/CD tools (e.g., GitLab, Jenkins)
  • Hands-on experience with Kubernetes, Docker, and cloud services
  • Understanding of ML training pipelines, data lifecycle, and model serving concepts
  • Familiarity with workflow orchestration tools (e.g., Airflow, Kubeflow, Ray, Vertex AI, Azure ML)
  • A demonstrated understanding of the ML lifecycle, model versioning, and monitoring
Job Responsibility:
  • Shares skills, knowledge, and expertise with members of the data engineering team
  • Fosters a culture of collaboration and continuous learning by organizing training sessions, workshops, and knowledge-sharing sessions
  • Collaborates and drives progress with cross-functional teams to design and develop new features and functionalities
  • Ensure that the developed solutions meet project objectives and enhance user experience
  • Have influence over the technology stack and internal technical improvements, contributing to strategic decision-making
  • Based on requirements and a longer-term product and feature strategy, design and implement reusable, testable, efficient, and elegant code
  • Ensure adherence to coding standards and best practices
  • Creates, maintains, and runs unit tests for new and existing applications and services
  • Aims to deliver defect-free and well-tested solutions
  • Analyzes and collects data from various sources such as log files, application stack traces, and thread dumps
Employment Type:
Full-time

Sr Machine Learning Engineer

We are seeking a Sr Machine Learning Engineer—Amgen’s most senior individual-con...
Location:
India, Hyderabad
Salary:
Not provided
Amgen
Expiration Date:
Until further notice
Requirements:
  • 3-5 years in AI/ML and enterprise software
  • Comprehensive command of machine-learning algorithms—regression, tree-based ensembles, clustering, dimensionality reduction, time-series models, deep-learning architectures (CNNs, RNNs, transformers) and modern LLM/RAG techniques
  • Proven track record selecting and integrating AI SaaS/PaaS offerings and building custom ML services at scale
  • Expert knowledge of GenAI tooling: vector databases, RAG pipelines, prompt-engineering DSLs and agent frameworks (e.g., LangChain, Semantic Kernel)
  • Proficiency in Python and Java; containerisation (Docker/K8s); cloud (AWS, Azure or GCP) and modern DevOps/MLOps (GitHub Actions, Bedrock/SageMaker Pipelines)
  • Strong business-case skills—able to model TCO vs. NPV and present trade-offs to executives
  • Exceptional stakeholder management; can translate complex technical concepts into concise, outcome-oriented narratives
Job Responsibility:
  • Engineer end-to-end ML pipelines (data ingestion, feature engineering, training, hyper-parameter optimisation, evaluation, registration and automated promotion) using Kubeflow, SageMaker Pipelines, OpenAI SDK or equivalent MLOps stacks
  • Harden research code into production-grade micro-services, packaging models in Docker/Kubernetes and exposing secure REST, gRPC or event-driven APIs for consumption by downstream applications
  • Build and maintain full-stack AI applications by integrating model services with lightweight UI components, workflow engines or business-logic layers so insights reach users with sub-second latency
  • Optimise performance and cost at scale—selecting appropriate algorithms (gradient-boosted trees, transformers, time-series models, classical statistics), applying quantisation/pruning, and tuning GPU/CPU auto-scaling policies to meet strict SLA targets
  • Instrument comprehensive observability—real-time metrics, distributed tracing, drift & bias detection and user-behaviour analytics—enabling rapid diagnosis and continuous improvement of live models and applications
  • Embed security and responsible-AI controls (data encryption, access policies, lineage tracking, explainability and bias monitoring) in partnership with Security, Privacy and Compliance teams
  • Contribute reusable platform components—feature stores, model registries, experiment-tracking libraries—and evangelise best practices that raise engineering velocity across squads
  • Perform exploratory data analysis and feature ideation on complex, high-dimensional datasets to inform algorithm selection and ensure model robustness
  • Partner with data scientists to prototype and benchmark new algorithms, offering guidance on scalability trade-offs and production-readiness while co-owning model-performance KPIs