
Senior ML Systems Engineer

United States, Sunnyvale Employment contract 170600.00 - 261300.00 USD / Year · Job Posted June 30, 2026

Job Description

Help teach our self-driving vehicles how to see and understand the world! The Data Labeling Engineering team designs, builds, and operates hybrid human/machine data labeling tools and pipelines that power autonomous vehicle machine learning models within General Motors' AV organization. We operate at the intersection of software engineering, data engineering, and AI/ML, defining the strategies, tooling, and quality controls that create reliable training data at scale. Our tools and platform are used by thousands of users and consumers. We own a modern full-stack architecture including TypeScript/React, Python, GraphQL, Golang, and ML model services, which powers data-annotation pipelines and machine-led training data solutions at foundation-model scale. We partner closely with AI/ML engineers, Product Operations, Product Management, Data Science, and other ML Platform groups. This role is ideal for an engineer who wants end-to-end ownership of meaningful pieces of the platform, growth toward technical leadership, and direct impact on systems that unblock the next generation of AV capabilities.

Job Responsibility

  • Build high-impact labeling experiences
  • Level up how ML teams work with data
  • Apply ML to labeling itself
  • Champion AI-assisted engineering
  • Own projects end-to-end
  • Collaborate across the AV stack

Requirements

  • 6+ years of experience building robust distributed platforms and applications
  • Hands-on experience leveraging AI tools (agentic coding, search, documentation generators, etc.) to accelerate understanding, implementation, debugging, and delivery of new capabilities
  • Proficiency in writing and reviewing high-quality, scalable, and performant full-stack code using technologies and languages like Python, TypeScript, Go, React, SQL, Redux, GraphQL, WebGL
  • Solid understanding of relational databases, data modeling, and API design
  • Strong fundamentals in object-oriented design and design patterns, data structures, algorithms, and engineering best practices (TDD, code quality, observability, CI/CD)
  • Experience developing and operating cloud-based applications

Nice to have

  • Experience using modern web APIs (Service Workers, Cache Storage, IndexedDB, etc.) in data-intensive or visualization-heavy applications
  • A track record of close collaboration with customers, product managers, designers, and user experience researchers
  • Experience with computer vision, machine learning, or data-centric AI projects — especially where labeled data, data quality, or autolabeling loops were central to the work
  • Familiarity with data labeling platforms or tools used by large labeling workforces (e.g., annotation UIs, workflow engines, quality systems)
  • Experience with A/B testing and telemetry/observability systems to measure impact and reliability
  • Proficiency in writing and reviewing high-quality, scalable, and performant code using TypeScript, React, Redux, GraphQL, WebGL, or similar frontend technologies

What we offer

  • Medical
  • Dental
  • Vision
  • Health Savings Account
  • Flexible Spending Accounts
  • Retirement savings plan
  • Sickness and accident benefits
  • Life insurance
  • Paid vacation & holidays
  • Tuition assistance programs
  • Employee assistance program
  • GM vehicle discounts

Similar Jobs for Senior ML Systems Engineer

8 matching positions

Senior ML Systems Engineer, Frameworks & Tooling

We’re looking for a senior engineer to help build, maintain and evolve the train...
Cohere
Salary: Not provided
Expiration Date: Until further notice
Requirements
  • Strong engineering experience in large-scale distributed training or HPC systems
  • Deep familiarity with JAX internals, distributed training libraries, or custom kernels/fused ops
  • Experience with multi-node cluster orchestration (Slurm, Ray, Kubernetes, or similar)
  • Comfort debugging performance issues across CUDA/NCCL, networking, IO, and data pipelines
  • Experience working with containerized environments (Docker, Singularity/Apptainer)
  • A track record of building tools that increase developer velocity for ML teams
  • Excellent judgment around trade-offs: performance vs complexity, research velocity vs maintainability
  • Strong collaboration skills — you’ll work closely with infra, research, and deployment teams
Job Responsibility
  • Build and own the training framework responsible for large-scale LLM training
  • Design distributed training abstractions (data/tensor/pipeline parallelism, FSDP/ZeRO strategies, memory management, checkpointing)
  • Improve training throughput and stability on multi-node clusters (e.g., GB200/300, AMD, H200/100)
  • Develop and maintain tooling for monitoring, logging, debugging, and developer ergonomics
  • Collaborate closely with infra teams to ensure our cluster, container environments, and hardware configurations support high-performance training
  • Investigate and resolve performance bottlenecks across the ML systems stack
  • Build robust systems that ensure reproducible, debuggable, large-scale runs
What we offer
  • An open and inclusive culture and work environment
  • Work closely with a team on the cutting edge of AI research
  • Weekly lunch stipend, in-office lunches & snacks
  • Full health and dental benefits, including a separate budget to take care of your mental health
  • 100% Parental Leave top-up for up to 6 months
  • Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
  • Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
  • 6 weeks of vacation (30 working days!)
  • Fulltime

Senior Platform Engineer, ML Data Systems

We’re looking for an ML Data Engineer to evolve our eval dataset tools to meet t...
Khan Academy
Location: United States, Mountain View
Salary: 137871.00 - 172339.00 USD / Year
Expiration Date: Until further notice
Requirements
  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
  • 5 years of Software Engineering experience with 3+ of those years working with large ML datasets, especially those in open-source repositories such as Hugging Face
  • Strong programming skills in Go, Python, SQL, and at least one data pipeline framework (e.g., Airflow, Dagster, Prefect)
  • Experience with data versioning tools (e.g., DVC, LakeFS) and cloud storage systems
  • Familiarity with machine learning workflows — from training data preparation to evaluation
  • Familiarity with the architecture and operation of large language models, and a nuanced understanding of their capabilities and limitations
  • Attention to detail and an obsession with data quality and reproducibility
  • Motivated by the Khan Academy mission “to provide a free world-class education for anyone, anywhere.”
  • Proven cross-cultural competency skills demonstrating self-awareness, awareness of others, and the ability to adopt inclusive perspectives, attitudes, and behaviors to drive inclusion and belonging throughout the organization.
Job Responsibility
  • Evolve and maintain pipelines for transforming raw trace data into ML-ready datasets
  • Clean, normalize, and enrich data while preserving semantic meaning and consistency
  • Prepare and format datasets for human labeling, and integrate results into ML datasets
  • Develop and maintain scalable ETL pipelines using Airflow, DBT, Go, and Python running on GCP
  • Implement automated tests and validation to detect data drift or labeling inconsistencies
  • Collaborate with AI engineers, platform developers, and product teams to define data strategies in support of continuously improving the quality of Khan’s AI-based tutoring
  • Contribute to shared tools and documentation for dataset management and AI evaluation
  • Inform our data governance strategies for proper data retention, PII controls/scrubbing, and isolation of particularly sensitive data such as offensive test imagery.
What we offer
  • Competitive salaries
  • Ample paid time off as needed
  • 8 pre-scheduled Wellness Days in 2026 occurring on a Monday or a Friday for a 3-day weekend boost
  • Remote-first culture - that caters to your time zone, with open flexibility as needed, at times
  • Generous parental leave
  • An exceptional team that trusts you and gives you the freedom to do your best
  • The chance to put your talents towards a deeply meaningful mission and the opportunity to work on high-impact products that are already defining the future of education
  • Opportunities to connect through affinity, ally, and social groups
  • 401(k) + 4% matching & comprehensive insurance, including medical, dental, vision, and life.
  • Fulltime

Senior ML Infrastructure / ML DevOps Engineer

We are looking for a Senior ML Infrastructure / DevOps Engineer who loves Linux,...
Pathway
Salary: Not provided
Expiration Date: Until further notice
Requirements
  • Former or current Linux / systems / network administrator comfortable living in the shell and debugging at OS and network layers (systemd, filesystems, iptables/security groups, DNS, TLS, routing)
  • 5+ years of experience in DevOps/SRE/Platform/Infrastructure roles running production systems, ideally with high‑performance or ML workloads
  • Deep familiarity with Linux as a daily driver, including shell scripting and configuration of clusters and services
  • Strong experience with workload management, containerization, and orchestration (Slurm, Docker, Kubernetes) in production environments
  • Solid understanding of CI/CD tools and workflows (GitHub Actions, GitLab CI, Jenkins, etc.), including building pipelines from scratch
  • Hands-on cloud infrastructure experience (AWS, GCP, Azure), especially around GPU instances, VPC/networking, storage, and managed ML services (e.g., SageMaker HyperPod, Vertex AI)
  • Proficiency with infrastructure as code (Terraform, CloudFormation, or similar) and a bias toward automation over manual operations
  • Experience with monitoring and logging stacks (Grafana, Prometheus, Loki, CloudWatch, or equivalents)
  • Familiarity with ML pipeline and experiment orchestration tools (MLflow, Kubeflow, Airflow, Metaflow, etc.) and with model/version management
  • Solid programming skills in Python, plus the ability to read and debug code that uses common ML libraries (PyTorch, TensorFlow) even if you are not a full‑time model developer
Job Responsibility
  • Design, operate, and scale GPU and CPU clusters for ML training and inference (Slurm, Kubernetes, autoscaling, queueing, quota management)
  • Automate infrastructure provisioning and configuration using infrastructure‑as‑code (Terraform, CloudFormation, cluster‑tooling) and configuration management
  • Build and maintain robust ML pipelines (data ingestion, training, evaluation, deployment) with strong guarantees around reproducibility, traceability, and rollback
  • Implement and evolve ML‑centric CI/CD: testing, packaging, deployment of models and services
  • Own monitoring, logging, and alerting across training and serving: GPU/CPU utilization, latency, throughput, failures, and data/model drift (Grafana, Prometheus, Loki, CloudWatch)
  • Work with terabyte‑scale datasets and the associated storage, networking, and performance challenges
  • Partner closely with ML engineers and researchers to productionize their work, translating experimental setups into robust, scalable systems
  • Participate in on‑call rotation for critical ML infrastructure and lead incident response and post‑mortems when things break
What we offer
  • Intellectually stimulating work environment
  • Be a pioneer: you get to work with realtime data processing & AI
  • Work in one of the hottest AI startups, with exciting career prospects
  • Team members are distributed across the world
  • Responsibilities and the ability to make a significant contribution to the company’s success
  • Inclusive workplace culture
  • Fulltime

Senior Engineer / Lead Engineer - Virtual Engineering - AI ML

Sponsorship:  GM DOES NOT PROVIDE IMMIGRATION-RELATED SPONSORSHIP FOR THIS ROLE....
General Motors
Location: India, Bengaluru
Salary: Not provided
Expiration Date: Until further notice
Requirements
  • Bachelor's or Master's degree in Mechanical/Automobile/Production/Mechatronics Engineering or a similar discipline
  • 5+ years of experience in Automotive Manufacturing / Manufacturing Engineering
  • 1+ year of experience implementing AI/ML solutions in Automotive use cases
  • Should have executed at least 2 end-to-end projects in the text or image data domain (from problem definition to deployment)
  • Strong programming skills in Python
  • Proficiency with ML/DL frameworks like Scikit-learn, TensorFlow, PyTorch, XGBoost
  • Solid understanding of statistics, probability, and linear algebra
  • Experience in data preprocessing, feature engineering, ETL and Exploratory Data Analysis (EDA)
  • Experience with MLOps platforms (MLflow, Kubeflow, Vertex AI, Azure ML)
  • Knowledge of ML model evaluation
Job Responsibility
  • Collaborate with stakeholders to understand business problems in the Manufacturing Engineering and Operations space and solve them using ML methodologies
  • Design, develop, and fine-tune AI/ML models for classification, regression, clustering, and recommendation systems
  • Work with MLOps tools to automate workflows, CI/CD pipelines, and model monitoring
  • Evaluate, validate, and benchmark model performance using appropriate metrics
  • Deploy AI models into production environments in collaboration with IT/AI teams
  • Establish monitoring and maintenance processes to ensure model accuracy over time
  • Ensure that all AI solutions comply with organizational data security, confidentiality, and regulatory requirements
  • Document workflows, results, and lessons learned for organizational knowledge sharing
  • Stay updated on advancements in ML model evaluation, ML frameworks, end-to-end ML pipelines
  • Fulltime

Senior ML Engineer

Provectus
Location: Colombia, Medellín, Antioquia; Bogotá, Capital District; Cali, Valle del Cauca; Barranquilla; Bucaramanga, Santander
Salary: Not provided
Expiration Date: Until further notice
Requirements
  • ML Fundamentals: supervised, unsupervised, and reinforcement learning
  • Model Development: feature engineering, model training, evaluation, hyperparameter tuning, and validation
  • ML Frameworks: classical ML libraries, TensorFlow, PyTorch, or similar frameworks
  • Deep Learning: CNNs, RNNs, Transformers
  • LLM Applications: Experience building production LLM-based applications
  • Prompt Engineering: Ability to design effective prompts and chain-of-thought strategies
  • RAG Systems: Experience building retrieval-augmented generation architectures
  • Vector Databases: Familiarity with embedding models and vector search
  • LLM Evaluation: Experience with evaluation metrics and techniques for LLM outputs
  • Python: Advanced proficiency in Python for ML applications
Job Responsibility
  • Design and implement end-to-end ML solutions from experimentation to production
  • Build scalable ML pipelines and infrastructure
  • Optimize model performance, efficiency, and reliability
  • Write clean, maintainable, production-quality code
  • Conduct rigorous experimentation and model evaluation
  • Troubleshoot and resolve complex technical challenges
  • Mentor junior and mid-level ML engineers
  • Conduct code reviews and provide constructive feedback
  • Share knowledge through documentation, presentations, and workshops
  • Collaborate with cross-functional teams (DevOps, Data Engineering, SAs)
What we offer
  • Long-term B2B collaboration
  • Fully remote setup
  • A budget for your medical insurance
  • Paid sick leave, vacation, public holidays
  • Continuous learning support, including unlimited AWS certification sponsorship
  • Fulltime

Senior ML Engineer (Audio)

Uber AI Solutions is one of Uber’s biggest bets with the ambition to build one o...
Uber
Location: India, Hyderabad
Salary: Not provided
Expiration Date: Until further notice
Requirements
  • Experience in building ML models for audio and speech intelligence
  • Proficiency in ASR, Speech Quality Evaluation, Audio Event Detection, and GenAI Audio Labeling
  • Ability to integrate advanced ML models for ML-assisted annotations
  • Collaboration skills with product managers, program managers, and cross-functional teams
Job Responsibility
  • Integrate advanced ML models to enable ML-assisted annotations for ASR, Speech Quality Evaluation, Audio Event Detection, and GenAI Audio Labeling
  • Optimize the Uber AI Solutions gig marketplace through intelligent supply and demand matching
  • Accelerate human-in-the-loop data annotation with automation
  • Develop robust automated evaluation systems
  • Collaborate with product managers, program managers, and cross-functional teams
What we offer
  • Accommodations may be available based on religious and/or medical conditions, or as required by applicable law
  • Fulltime

Senior ML Engineer (GenAI, AWS)

Provectus helps companies adopt ML/AI to transform the ways they operate, compet...
Provectus
Location: Colombia, Medellín; Bogotá; Cali; Barranquilla; Bucaramanga
Salary: Not provided
Expiration Date: Until further notice
Requirements
  • ML Fundamentals: supervised, unsupervised, and reinforcement learning
  • Model Development: feature engineering, model training, evaluation, hyperparameter tuning, and validation
  • ML Frameworks: classical ML libraries, TensorFlow, PyTorch, or similar frameworks
  • Deep Learning: CNNs, RNNs, Transformers
  • LLM Applications: Experience building production LLM-based applications
  • Prompt Engineering: Ability to design effective prompts and chain-of-thought strategies
  • RAG Systems: Experience building retrieval-augmented generation architectures
  • Vector Databases: Familiarity with embedding models and vector search
  • LLM Evaluation: Experience with evaluation metrics and techniques for LLM outputs
  • Python: Advanced proficiency in Python for ML applications
Job Responsibility
  • Design and implement end-to-end ML solutions from experimentation to production
  • Build scalable ML pipelines and infrastructure
  • Optimize model performance, efficiency, and reliability
  • Write clean, maintainable, production-quality code
  • Conduct rigorous experimentation and model evaluation
  • Troubleshoot and resolve complex technical challenges
  • Mentor junior and mid-level ML engineers
  • Conduct code reviews and provide constructive feedback
  • Share knowledge through documentation, presentations, and workshops
  • Collaborate with cross-functional teams (DevOps, Data Engineering, SAs)
What we offer
  • Long-term B2B collaboration
  • Fully remote setup
  • A budget for your medical insurance
  • Paid sick leave, vacation, public holidays
  • Continuous learning support, including unlimited AWS certification sponsorship
  • Fulltime

Senior ML Engineer

We are the global test and automation specialists, powering next-generation tech...
Teradyne
Location: United States, North Reading
Salary: 158600.00 - 253700.00 USD / Year
Expiration Date: Until further notice
Requirements
  • 5+ years of experience in machine learning, applied AI, or related fields
  • Hands-on experience fine-tuning large language models
  • Experience with reinforcement learning (e.g., policy gradients, PPO, actor-critic methods)
  • Experience designing reward models or evaluation systems
  • Strong software engineering skills (Python, distributed systems familiarity)
  • Experience building production ML systems (MLOps, monitoring, deployment)
  • Ability to work cross-functionally with product, software, and hardware teams
  • Strong communication skills; comfortable engaging directly with customers & stakeholders
  • Computer vision skills in manufacturing inspection: defect detection, etc.
Job Responsibility
  • Own the technical direction and quality of all ML model development across Teradyne's AI initiatives, setting and enforcing engineering standards across the team
  • Lead end-to-end development of production ML systems: data ingestion and feature engineering, model architecture design, training pipelines, evaluation frameworks, and deployment
  • Design and implement novel ML approaches tailored to Teradyne's unique data domain including time-series parametric test data (STDF/TEMS), wafer map analysis, etc.
  • Drive applied research and model innovation, explore and evaluate new architectures, algorithms, and training methodologies, and translate promising approaches into production systems
  • Develop and maintain rigorous model evaluation frameworks, including validation methodologies, risk quantification, and production monitoring strategies
  • Lead technical design reviews; serve as final arbiter of ML architecture and modeling decisions for the team
  • Build and maintain production ML systems with a strong focus on reliability, scalability, and performance in Teradyne's ATE and manufacturing environments
  • Partner directly with customers and application engineers to understand real-world debug workflows and translate them into ML solutions
  • Mentor and develop junior ML engineers
What we offer
  • medical
  • dental
  • vision
  • Flexible Spending Accounts
  • retirement savings plans
  • life and disability insurance
  • paid vacation & holidays
  • tuition assistance programs
  • Fulltime