Senior Machine Learning Engineer, AI Platform

United States; Canada · Employment contract · 139000.00 - 218000.00 USD / Year · Job Posted June 16, 2026

Job Description

The AI Platform team is responsible for building the foundational infrastructure that powers intelligent experiences across Mozilla products. This includes model training pipelines, high-throughput inference services, GPU orchestration, and secure, privacy-respecting AI systems that operate reliably at global scale. We’re looking for a Machine Learning Engineer with a strong platform mindset to help design, build, and operate Mozilla’s AI platform.

Job Responsibility

  • Design, build, and operate core AI platform components used to train, deploy, and serve machine learning models in production environments
  • Own model serving and inference workflows end-to-end, driving improvements in reliability, scalability, performance, and operational excellence
  • Lead efforts to optimize inference systems for throughput, latency, and cost efficiency across CPU and GPU workloads
  • Design and manage GPU-based inference and training workloads, including performance tuning, capacity planning, and resource utilization optimization
  • Own and improve critical parts of the model lifecycle, including packaging, versioning, testing strategies, validation, and deployment automation
  • Implement and evolve observability practices (metrics, logging, tracing, alerting) to improve visibility and operational resilience of ML services and pipelines
  • Partner closely with product, infrastructure, security, and data teams to design scalable platform capabilities that enable AI-powered features
  • Contribute to technical design discussions, propose architectural improvements, and mentor junior engineers through code reviews and knowledge sharing
  • Participate in and help improve operational processes, including incident response, on-call rotations, and post-incident reviews

Requirements

  • Bachelor’s degree with 4–6 years of relevant industry experience, or Master’s degree with significant hands-on experience building and operating production ML systems, or equivalent work experience
  • Strong experience developing in Python for machine learning systems, backend services, or distributed data processing
  • Proven experience deploying and operating ML workloads in cloud environments, including production-grade infrastructure
  • Solid understanding of model serving architectures, inference pipelines, and performance tradeoffs (latency, throughput, cost, scaling strategies)
  • Hands-on experience working with GPU-based workloads and accelerated computing in production settings
  • Experience designing CI/CD pipelines and development workflows that support reliable ML system deployment
  • Ability to independently scope and drive technical initiatives while balancing product and operational priorities
  • Strong problem-solving skills and the ability to debug performance and reliability issues in distributed systems
  • Clear and effective communication skills, with experience collaborating across engineering, product, and infrastructure teams

Nice to have

  • Experience implementing inference optimization strategies such as batching, quantization, compilation, model conversion, or hardware-specific tuning
  • Familiarity with containerization and orchestration systems (e.g., Docker, Kubernetes) in production environments
  • Experience designing observability systems for distributed services, including metrics strategy and performance profiling
  • Exposure to privacy-preserving ML techniques, security best practices, or responsible AI system design
  • Contributions to open-source ML infrastructure projects or leadership in building reusable internal ML tooling

What we offer

  • Generous performance-based bonus plans
  • Rich medical, dental, and vision coverage
  • Generous retirement contributions with 100% immediate vesting
  • Quarterly all-company wellness days
  • Country-specific holidays plus a day off for your birthday
  • One-time home office stipend
  • Annual professional development budget
  • Quarterly well-being stipend
  • Considerable paid parental leave
  • Employee referral bonus program
  • Other benefits (life/AD&D, disability, EAP, etc.)

Similar Jobs for Senior Machine Learning Engineer, AI Platform

8 matching positions

Senior Machine Learning Engineer, Platform

We seek an outstanding, creative, and passionate Machine Learning Platform Engin...
Location: United States, San Jose
Salary: 229500.00 - 367100.00 USD / Year
Company: Roku
Expiration Date: Until further notice
Requirements
  • 5+ years of experience building software solutions to concrete problems
  • Strong CS fundamentals. Should be able to write an algorithm with ease
  • Fluent in at least one high-level programming language such as Java, Scala, Kotlin, or Python
  • Worked with big data systems (Spark, Kafka, Flink, S3, Airflow)
  • Familiar with ML frameworks and tools: Ray, PyTorch, Hugging Face, AWS SageMaker
  • AI literacy and curiosity. You have either tried Gen AI in your previous work or outside of work or are curious about Gen AI and have explored it
  • MS in Computer Science or related field
Job Responsibility
  • Design, build, and maintain scalable platform services (feature store, real-time inference services, vector DBs, etc.) that serve millions of transactions per second
  • Run and monitor online A/B tests via robust platform services, analyzing platform metrics and business KPIs to optimize recommendation system performance
  • Collaborate closely with US-based engineering and cross-functional teams to translate business requirements into modular platform components and APIs
  • Enhance and evolve the ML platform ecosystem to support high developer velocity, system scalability, and adaptability to future business needs
  • Contribute to onboarding, training, and mentoring new team members on emerging platform engineering best practices and technologies
What we offer
  • health insurance
  • equity awards
  • life insurance
  • disability benefits
  • parental leave
  • wellness benefits
  • paid time off
  • global access to mental health and financial wellness support and resources
  • commuter benefits
  • retirement options (401(k)/pension)
  • Fulltime

Senior Machine Learning Platform Engineer

We are looking for a Senior Machine Learning Platform Engineer to join the growi...
Location: United States, San Francisco
Salary: 180000.00 - 200000.00 USD / Year
Company: Strava
Expiration Date: Until further notice
Requirements
  • Have worked on complex, ambiguous platform challenges and broken them down into manageable tasks with both strategic and tactical execution
  • Demonstrated technical leadership in leading projects and the ability to mentor and grow early-career team members
  • Have demonstrated strong interpersonal and communication skills, and a collaborative approach to drive business impact across teams
  • Have worked with a variety of MLOps tools that fulfill different ML needs (like FastAPI, LitServe, Metaflow, MLflow, Kubeflow, Feast)
  • Are experienced in production ML model operational excellence and best practices, like automated model retraining, performance monitoring, feature logging, A/B testing
  • Experience with generative AI technologies around LLM evaluation, vector stores, and agent frameworks
  • Have built backend production tools and services on cloud environments like (but not limited to) AWS, using Python, Terraform, and similar technologies
  • Have built and worked on data pipelines using large scale data technologies (like Spark, SQL, Snowflake)
  • Have experience building, shipping, and supporting ML models in production at scale
  • Have experience with exploratory data analysis and model prototyping, using languages such as Python or R and tools like scikit-learn, pandas, NumPy, PyTorch, TensorFlow, SageMaker
Job Responsibility
  • Own End to End Systems: Drive key projects to power AI/ML at Strava end-to-end, from gathering stakeholder requirements to ground-up development to driving adoption and optimizing the experience
  • Interact with a Rich and Large Dataset: Explore and help leverage Strava’s extensive unique fitness and geo datasets from millions of users to extract actionable insights, inform product decisions, and optimize existing features
  • Contribute to a Well-Loved Consumer Product: Work at the intersection of AI and fitness to help launch and maintain product experiences that will be used by tens of millions of active people worldwide
What we offer
  • Offers Equity
  • Fulltime

Senior Platform Machine Learning Engineer

Machine learning is the crucial enabler for every financial service EarnIn provi...
Location: United States, Mountain View
Salary: 232200.00 - 283800.00 USD / Year
Company: EarnIn
Expiration Date: Until further notice
Requirements
  • Bachelor's or Master’s degree in Computer Science, Engineering, or a related field, or relevant equivalent experience
  • 4+ years of industry machine learning experience and excellent software engineering skills
  • Strong programming skills in Python, with familiarity in ML frameworks such as TensorFlow or PyTorch
  • Experience with ML cloud platforms like AWS SageMaker, Databricks, or GCP Vertex AI
  • Experience with LLM Ops, foundation model APIs, and AI engineering
  • Familiarity with data pipeline and workflow management tools
  • Strong communication and collaboration skills
  • Passion for learning and staying updated with the latest machine learning and platform engineering industry trends
Job Responsibility
  • Design, build, and maintain the ML and AI platform and tools to support the end-to-end machine learning lifecycle
  • Work closely with other machine learning engineers to understand their workflows, optimize model training and deployment processes, and ensure the reproducibility of results
  • Ensure scalability, reliability, cost efficiency, and ease of use of the machine learning platform
  • Contribute to evaluating and adopting new technologies and tools to enhance our machine-learning capabilities
  • Set examples of outstanding operational excellence. Be the catalyst for step-jump changes
What we offer
  • equity and benefits
  • Fulltime

Senior Machine Learning Engineer, Generative AI Products

Lead comprehensive applications/web development for highly complex projects; typ...
Location: United States, Boston
Salary: Not provided
Company: HBS
Expiration Date: Until further notice
Requirements
  • Minimum of seven years’ post-secondary education or relevant work experience
  • Bachelor's degree in mathematics, physics, computer science, engineering, statistics, or an equivalent technical discipline desired
  • Minimum of five years’ software development experience with Python and SQL
  • Minimum of three years’ experience building pipelines to deploy NLP and deep learning models into production in a cloud environment
  • Minimum of three years’ experience using PyTorch, TensorFlow, or MXNet, along with optimizing code for GPU clusters
  • Experience building advanced workflows such as retrieval augmented generation, model chaining, dynamic prompting, PEFT/SFT, etc. using LangChain and similar tools
  • Experience establishing model guardrails and developing bias detection and mitigation techniques for AI applications using tools such as NeMo
  • Experience with various embedding models and setting up and tuning vector databases to improve performance of semantic search and retrieval systems
  • Understanding of the underlying fundamentals, such as Transformers and self-attention mechanisms, that form the theoretical foundation of LLMs
  • Experience working with a variety of relational SQL and NoSQL databases, big data tools: Hadoop, Spark, Kafka
Job Responsibility
  • Build trust and collaboration by being present on-site and engaging directly with colleagues and various constituents
  • Architect, build, maintain, and improve the new and existing suite of GenAI applications and their underlying systems
  • Automate machine learning pipelines, monitor performance and costs, and optimize models by using techniques such as LoRA/QLoRA
  • Establish reusable frameworks to streamline model building, deployment and monitoring
  • Incorporate comprehensive monitoring, logging, tracing, and alerting mechanisms
  • Build guardrails, compliance rules and oversight workflows into the GenAI application platform, such as establishing approval chains for model updates and staged rollout for production releases
  • Develop templates, guides and sandbox environments for easy onboarding of new contributors and experimentation with new techniques
  • Ensure development of user-facing applications in the GenAI application platform is easy and safe by enforcing rigorous validation testing before publishing user-generated models and implementing a clear peer review process for applications
  • Use your entrepreneurial spirit to identify new opportunities to optimize business processes, improve consumer experiences, and prototype solutions to demonstrate value
  • Work closely with data scientists and analysts to create and deploy new product features online and in mobile apps
What we offer
  • Generous paid time off including parental leave
  • Medical, dental, and vision health insurance coverage starting on day one
  • Retirement plans with university contributions
  • Wellbeing and mental health resources
  • Support for families and caregivers
  • Professional development opportunities including tuition assistance and reimbursement
  • Commuter benefits, discounts and campus perks
  • Fulltime

Senior Machine Learning Engineer - Multimodal - AI Teams

At Doctolib, we're on a mission to transform the way healthcare is delivered by ...
Location: France, Paris
Salary: Not provided
Company: Doctolib
Expiration Date: Until further notice
Requirements
  • Ph.D. degree (preferred) or MSc in Computer Science, Machine Learning, Computational Medicine, or a related field
  • At least 5 years of experience working with deep learning models (including training), in particular computer vision and language models; voice is a plus
  • Track record of deploying models in production (with active users), with high safety requirements, ideally in the medical domain
  • Collaborative and organized mindset: collaborate with other members of your team and medical experts. Ready to take ownership of projects, taking into account team dependencies and timelines, making progress step by step
  • Proficiency with deep learning frameworks such as PyTorch or JAX
Job Responsibility
  • Design an evaluation framework and prepare datasets for the tasks we’re solving
  • Benchmark a baseline model against open-source and proprietary models
  • Scope a strategy to develop a model that reaches our performance targets
  • Lead experiments and report results
  • Deploy your model in production guided by our ML platform team
What we offer
  • Free comprehensive health insurance for you and your children
  • Parent Care Program: receive one additional month of leave on top of the legal parental leave
  • Free mental health and coaching services through our partner Moka.care
  • For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support
  • Work from abroad for up to 10 days per year thanks to our flexibility days policy
  • Work Council subsidy to refund part of sport club membership or creative class
  • Up to 14 days of RTT
  • Lunch voucher with Swile card
  • Relocation support for international mobilities
  • Fulltime

Senior Staff Machine Learning Engineer – Moonshot AI

The Moonshot AI team sits within Uber AI Solutions where we are building an ente...
Location: United States, Sunnyvale
Salary: 267000.00 - 297000.00 USD / Year
Company: Uber
Expiration Date: Until further notice
Requirements
  • 10+ years of industry experience developing and shipping production machine learning models
  • Ph.D., MS, or Bachelor's degree in Computer Science, Machine Learning, or a closely related discipline
  • Proven track record of technical leadership on large-scale ML initiatives with measurable business impact
  • Deep expertise across multiple areas: Computer Vision, Natural Language Processing, Deep Learning, and Generative AI
  • Strong proficiency with modern ML frameworks (PyTorch, TensorFlow, JAX) and programming languages
  • Extensive experience with distributed training infrastructure, large-scale model development, and ML platform design
  • Demonstrated ability to collaborate with product, engineering, and data science leadership on technical roadmaps and strategic priorities
  • Excellent problem-solving abilities with deep ML methodology expertise
Job Responsibility
  • Shape the technical vision and roadmap for Moonshot AI's ML initiatives
  • Architect foundational ML platforms and systems for marketplace optimization and annotation automation
  • Drive end-to-end ML solutions from conception through production deployment
  • Lead GenAI innovation: design and implement cutting-edge systems using custom SLMs, computer vision, and LLMs
  • Advance AI research capabilities: establish research direction, design benchmarks, contribute to research and publications
  • Build industry-leading evaluation frameworks: architect LLM-as-Judge systems and automated quality assessment platforms
  • Provide technical leadership across Uber AI Solutions
  • Mentor and develop engineering talent
  • Enable cross-functional impact
What we offer
  • Bonus program
  • Equity award & other types of comp
  • 401(k) plan
  • Various benefits
  • Fulltime

Senior Machine Learning Engineer, ML Training Platform

Location: United States
Salary: 216700.00 - 303400.00 USD / Year
Company: Reddit
Expiration Date: Until further notice
Requirements
  • 5+ years of software engineering experience, with a focus on Platform Engineering, ML Infrastructure, or Backend Systems
  • Deep Kubernetes Expertise: You know K8s beyond just 'deploying pods.' You understand CRDs, Controllers and the Operator pattern
  • Jupyter Ecosystem Knowledge: Experience customizing JupyterHub, JupyterLab extensions, or building similar interactive computing platforms
  • Strong Coding Skills: Proficiency in Python (for the ML ecosystem) and Go (for Kubernetes controllers/infrastructure tooling)
  • GPU Experience: Hands-on practice with CUDA environments, GPU virtualization/containerization, and doing it all within Kubernetes
  • Cloud Provider Experience: Familiarity with both managed ML offerings (Vertex AI, SageMaker, etc.) and building custom ML components in AWS and/or GCP
  • Experience working with distributed training frameworks, including Ray and Kubernetes
  • Comfortable with distributed systems, big data (petabyte scale), and data-intensive systems
  • Strong focus on scalability, reliability, performance, and ease of use. You are an undying advocate for platform users and have a deep intuition for the machine learning development lifecycle
  • Strong organizational & communication skills
Job Responsibility
  • Lead the building, testing, and maintenance of ML training infrastructure at Reddit
  • Play a pivotal role in designing, building, and optimizing the infrastructure and tooling required to support large-scale machine learning workflows
  • Evolve the MLE experience, from provisioning interactive GPU environments through large-scale training, supporting on-demand and self-service workflows
  • Kubernetes Automation: Write custom Kubernetes Controllers and Operators to manage the lifecycle of interactive Jupyter workspaces and long-running ML training jobs, handle auto-idling, and ensure fault tolerance
  • GPU Orchestration: Work with the underlying compute team to ensure MLEs have efficient access to training hardware resources and handle resource contention gracefully
  • Developer Experience (DevX): Treat internal MLEs as your customers. Conduct user research, reduce friction in the 'Idea-to-Prototype' loop, and standardize software environments (Docker images, Python dependency management)
What we offer
  • Comprehensive Healthcare Benefits and Income Replacement Programs
  • 401k Match
  • Family Planning Support
  • Gender-Affirming Care
  • Mental Health & Coaching Benefits
  • Flexible Vacation & Reddit Global Days off
  • Generous paid Parental Leave
  • Paid Volunteer time off
  • Fulltime

Senior Machine Learning Engineer – Ranking & Recommendations (Generative AI)

The Shopping Ranking Team mission is enabling eaters to effortlessly make shoppi...
Location: United States, New York; Seattle; San Francisco; Sunnyvale
Salary: 202000.00 - 224000.00 USD / Year
Company: Uber
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree or equivalent in Computer Science, Engineering, Mathematics or related field, with 4+ years of full-time engineering experience
  • 4+ years of ML experience and building ML models
  • Experience working with multiple cross-functional teams (product, science, product ops, etc.)
  • Expertise in one or more object-oriented programming languages (e.g. Python, Go, Java, C++)
  • Experience with big-data architecture, ETL frameworks and platforms, such as HDFS, Hive, MapReduce, Spark, etc.
  • Working knowledge of the latest ML technologies and libraries, such as PyTorch, TensorFlow, and Ray
  • Proven track record of being a fast learner and go-getter, with willingness to get out of the comfort zone
Job Responsibility
  • Design and build Machine Learning models in the Ranking and Recommendation domain
  • Productionize and deploy these models for real-world applications
  • Review code and designs of teammates, providing constructive feedback
  • Collaborate with Product and cross-functional teams to brainstorm new solutions and iterate on the product
What we offer
  • Eligible to participate in Uber's bonus program
  • May be offered an equity award & other types of comp
  • All full-time employees are eligible to participate in a 401(k) plan
  • Eligible for various benefits (details at provided link)
  • Fulltime