Job Description:
At General Motors, our product teams are redefining mobility. Through a human-centered design process, we create vehicles and experiences that are designed not just to be seen, but to be felt. We're turning today's impossible into tomorrow's standard, from breakthrough hardware and battery systems to intuitive design, intelligent software, and next-generation safety and entertainment features. Every day, our products move millions of people as we aim to make driving safer, smarter, and more connected, shaping the future of transportation on a global scale.

The Data Scaling team owns the Data Flywheel for AV Foundation model development and successive fine-tuning. It defines the composition of the data the AV needs to learn behaviors at scale and deliver the driving behaviors necessary for product success. The team owns the definition and processes for data quality across the data loop. It works directly on and delivers ML models to the product that successively move up the data scaling curves, directly improving AV product performance through smart use of data. As part of this work, the team builds scalable systems and pipelines that aim to increase the data used, its diversity, and its impact on the models tenfold with successive major releases. The team uses the very large datasets that GM already has access to internally, and it defines the next generation of highest-value datasets that GM continues to collect, both from real driving by GM and retail fleets and from synthetic, simulation-based sources.

Why Join Us?

Scale up AV foundation model pre-training and fine-tuning with data to its maximum capacity, to billions of examples and beyond, across different sources, delivering the maximum value to the model through every additional example.
Work with cutting-edge technology and a collaborative, high-impact team of AI/ML engineers, data scientists, and engineers who are passionate about leveraging advanced AI techniques to drive innovation for L2, L3, and L4 applications.

Contribute to the safety, reliability, and scalability of next-generation autonomous vehicles.

As a Staff AI/ML Engineer in the Embodied AI Data Foundations organization, you will serve as a senior individual contributor developing machine learning solutions that directly impact autonomous driving performance, making full use of multiple sources of data, including real and synthetic datasets. You will design and implement advanced data curation and model training recipes that will deliver ML models that enable safe and reliable vehicle behavior across diverse real-world scenarios.

In this role, you will partner closely with cross-functional engineering teams, contribute to core technical direction within your domain, and support the growth of engineers through technical collaboration and mentorship. You will help translate research into scalable onboard and offboard ML solutions while contributing to the continuous improvement of our autonomy stack.