Senior ML Compiler Engineer

General Motors

Location:
United States, Austin

Contract Type:
Not provided

Salary:

128700.00 - 261300.00 USD / Year

Job Description:

About the Mission:
GM's vision of Zero Crashes, Zero Emissions, and Zero Congestion guides everything we do in autonomous and assisted driving. The AV organization is building advanced automated driving technologies, including Level 4-capable fully self-driving systems, to move us toward safer, more sustainable, and more accessible mobility. For the AI Kernels & Compilers team, that mission shows up in the details: turning cutting-edge perception, prediction, and planning research into production-grade software that can run efficiently and reliably on real vehicles at scale. We pioneer new approaches to model export, kernel development, and performance engineering so that every cycle on our accelerators translates into better situational awareness, faster reaction times, and more robust behavior on the road. If you want your compiler and kernels work to directly influence how automated vehicles understand and react to the world, while operating at the safety, reliability, and scale of a company like GM, this is where that impact becomes real.

About the Team:
The AI Compiler team sits at the heart of how advanced AI models make it onto the car. We own the compiler that turns high-level models into fast, reliable inference across the GPUs powering GM's next-generation autonomous and assisted driving features. Our work spans graph lowering, operator coverage, kernel integration, and deployment tooling, with a mandate to squeeze every millisecond out of on-vehicle workloads while preserving correctness and robustness in real-world conditions. We partner closely with the AI Deployments, AI Solutions, Runtime, and AI Kernels teams to co-design a platform that enables new research ideas to be quickly and safely shipped to production fleets. You'll join a group of experienced compiler, systems, and GPU engineers who enjoy working on hard problems and diving into MLIR/ONNX and CUDA/TensorRT internals. We value clear thinking, strong engineering fundamentals, and a culture where people can do the best work of their careers on problems that directly shape the future of automated driving.

The Role:
As a Senior Compiler Engineer on the AI Kernels & Compilers team, you will work on the compilation stack that takes high-level models and turns them into highly optimized inference artifacts running on GM's autonomous and assisted driving platforms. You'll be a key contributor to the pipeline and tooling that make that path fast, reliable, and effortless for ML engineers across the AV organization to compile their models. You will work on evolving a state-of-the-art model export and compilation pipeline, from capturing high-level model graphs, through intermediate representations and compiler transforms, to accelerator-specific inference engines and their integration with our runtime, so that we can simultaneously optimize compilation throughput, model fidelity, and on-vehicle latency. Along the way, you'll build robust tooling to validate numerical correctness, detect and bisect performance regressions, and surface clear, actionable diagnostics back to model authors. If you want to work at the intersection of compilers, performance engineering, and real-world autonomy, this role puts your decisions directly on the critical path of what runs on the car.
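The numerical-correctness tooling described above can be sketched in plain Python. This is an illustrative example only, not GM's actual validation code; the function name, tolerances, and return shape are all assumptions. It shows the mixed absolute/relative tolerance criterion commonly used (e.g., by numpy.allclose) when comparing a compiled model's outputs against a framework reference:

```python
# Illustrative sketch (hypothetical, not GM tooling): compare a compiled
# model's outputs against a framework reference using the mixed
# absolute/relative tolerance criterion |ref - out| <= atol + rtol * |ref|.

def outputs_match(reference, compiled, atol=1e-5, rtol=1e-3):
    """Return (ok, worst_error) for two flat lists of floats.

    atol guards values near zero; rtol scales with magnitude, which
    matters when low-precision kernels (e.g. FP16) are in the loop.
    """
    if len(reference) != len(compiled):
        return False, float("inf")
    ok = True
    worst = 0.0
    for ref, out in zip(reference, compiled):
        err = abs(ref - out)
        worst = max(worst, err)
        if err > atol + rtol * abs(ref):
            ok = False
    return ok, worst
```

A real pipeline would apply a check like this per output tensor and attach the failing tensor's name and worst-error element to the diagnostic surfaced back to the model author.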

Job Responsibility:

  • Build and evolve the model compilation toolchain used to deploy large‑scale perception, prediction, and planning models to the AV
  • Architect new compiler passes and analyses that improve build times, memory footprint, and runtime latency while preserving (or intentionally trading off) fidelity under strict safety and reliability constraints
  • Collaborate closely with kernels, runtime, and hardware teams to co‑design interfaces, shape accelerator capabilities, and ensure the compiler exposes the right abstractions to unlock peak performance on each platform
  • Set standards and best practices for model export, validation, and debugging so that AV teams can iterate quickly with clear, reproducible performance and accuracy characteristics
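To make the compiler-pass responsibility above concrete, here is a minimal constant-folding pass over a toy graph IR. The node format is invented purely for illustration and bears no relation to GM's internal representation; production passes in MLIR or ONNX apply the same principle of replacing statically computable subgraphs with precomputed constants:

```python
# Hypothetical toy IR: each node is (name, op, inputs), where op is
# "const" (inputs = [value]) or "add"/"mul" (inputs = node names).
# Constant folding evaluates ops whose operands are all known at
# compile time, shrinking the work left for the runtime.

OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

def fold_constants(graph):
    """Return a new graph with statically computable nodes folded."""
    values = {}   # name -> constant value known at compile time
    folded = []
    for name, op, inputs in graph:
        if op == "const":
            values[name] = inputs[0]
            folded.append((name, op, inputs))
        elif op in OPS and all(i in values for i in inputs):
            # All operands are compile-time constants: evaluate now.
            result = OPS[op](*(values[i] for i in inputs))
            values[name] = result
            folded.append((name, "const", [result]))
        else:
            # Depends on a runtime input; leave it alone.
            folded.append((name, op, inputs))
    return folded

graph = [
    ("a", "const", [2.0]),
    ("b", "const", [3.0]),
    ("c", "mul", ["a", "b"]),      # foldable: 2.0 * 3.0
    ("d", "add", ["c", "input"]),  # not foldable: runtime-dependent
]
```

Real passes must also prove the fold is safe (no side effects, matching floating-point semantics) before rewriting, which is where the "preserving or intentionally trading off fidelity" constraint comes in.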

Requirements:

  • 3+ years of experience in the field of compilers
  • Experience with ML frameworks (e.g., PyTorch, TensorFlow, JAX) and the associated software stack (e.g., ONNX, MLIR, XLA, TVM, TensorRT)
  • Expertise in writing production-quality Python/C++ code
  • Expertise in the software development life cycle: coding, debugging, optimization, testing, integration
  • BS or higher degree in CS/CE/EE, or equivalent

Nice to have:

  • Experience building and optimizing ONNX‑based model export and deployment pipelines
  • GPU programming (CUDA) and familiarity with the ML software stack (e.g., cuDNN, cuBLAS)
  • Experience with ML accelerators and hardware architecture
  • Experience developing and deploying machine learning models

What we offer:

  • medical
  • dental
  • vision
  • Health Savings Account
  • Flexible Spending Accounts
  • retirement savings plan
  • sickness and accident benefits
  • life insurance
  • paid vacation & holidays
  • tuition assistance programs
  • employee assistance program
  • GM vehicle discounts

Additional Information:

Job Posted:
March 25, 2026

Employment Type:
Fulltime
Work Type:
Hybrid work

Similar Jobs for Senior ML Compiler Engineer

Senior Research Engineer

We are seeking a highly skilled Senior Research Engineer to collaborate closely ...
Location:
United States
Salary:
210000.00 - 309000.00 USD / Year
Assembly
Expiration Date:
Until further notice
Requirements:
  • Strong expertise in the Python ecosystem and major ML frameworks (PyTorch, JAX)
  • Experience with lower-level programming (C++ or Rust preferred)
  • Deep understanding of GPU acceleration (CUDA, profiling, kernel-level optimization)
  • TPU experience is a strong plus
  • Proven ability to accelerate deep learning workloads using compiler frameworks, graph optimizations, and parallelization strategies
  • Solid understanding of the deep learning lifecycle: model design, large-scale training, data processing pipelines, and inference deployment
  • Strong debugging, profiling, and optimization skills in large-scale distributed environments
  • Excellent communication and collaboration skills, with the ability to clearly prioritize and articulate impact-driven technical solutions
Job Responsibility:
  • Investigate and mitigate performance bottlenecks in large-scale distributed training and inference systems
  • Develop and implement both low-level (operator/kernel) and high-level (system/architecture) optimization strategies
  • Translate research models and prototypes into highly optimized, production-ready inference systems
  • Explore and integrate inference compilers such as TensorRT, ONNX Runtime, AWS Neuron and Inferentia, or similar technologies
  • Design, test, and deploy scalable solutions for parallel and distributed workloads on heterogeneous hardware
  • Facilitate knowledge transfer and bidirectional support between Research and Engineering teams, ensuring alignment of priorities and solutions
What we offer:
  • competitive equity grants
  • 100% employer-paid benefits
  • flexibility of being fully remote
Employment Type:
Fulltime

Senior ML Ops Engineer - Architecture & Strategy

We own the platform blueprint for our ML infrastructure: designing systems that ...
Location:
Germany, Munich
Salary:
Not provided
BMW
Expiration Date:
Until further notice
Requirements:
  • University degree in Computer Science, Computer/Electrical Engineering or related subjects
  • 5–8+ years in ML platform or infrastructure engineering, with at least two years in a tech lead or architect role
  • Deep expertise in either AWS, Azure or Google cloud, ideally with multi-region or multi-account setups
  • Proven track record designing systems for PB-scale data and hundreds of concurrent training jobs as well as understanding of large vision models and the challenges of compressing them for automotive-grade SoCs
  • Strong knowledge of Kubernetes platform design, GitOps, and infrastructure-as-code
  • Excellent communication skills to align ML researchers, embedded engineers, data teams, and executives
  • Familiarity with edge model compilation toolchains for Qualcomm (QNN, AIMET) and/or NVIDIA (TensorRT, Triton) and experience with automotive data at scale, such as MDF4, MCAP, ROS bags, and multi-sensor synchronisation
Job Responsibility:
  • You design the reference architecture for the ML platform end-to-end: data ingestion, PB-scale data lake, heterogeneous training clusters, model registry, and deployment-ready artefacts
  • You design the data-format backbone, setting standards for data flows, ingestion, cataloguing, transcoding, and partitioning at PB scale, integrated with dataset management tooling
  • You define the platform component topology and integration contracts for pipeline orchestration, experiment tracking, hyperparameter optimisation, dataset management, observability, and metadata
  • You establish model lifecycle governance, including experiment tracking, approval gates, validation criteria, and clear handoff contracts to deployment teams
  • You drive cost governance at PB scale, including accelerator spot strategies, S3 tiering, cross-AZ traffic reduction, and Kubernetes cluster right-sizing
  • You partner with Security, Legal, and Functional-Safety teams on ISO 26262, ISO 8800, and data-protection compliance
What we offer:
  • Challenging projects with which we shape the mobility of tomorrow together
  • Wide range of personal and professional development opportunities
  • Attractive, fair and performance-related remuneration
  • High level of job security
  • Annual special payments such as vacation pay, Christmas bonus, and profit sharing
  • Flexible working hours including six weeks annual leave and overtime compensation
  • Discounted BMW & MINI conditions
  • Many other benefits at bmw.jobs/benefits
Employment Type:
Fulltime

Software Engineer II and Senior Software Engineer - AI Compilers

The AI Frameworks team at Microsoft develops the AI software used to train and d...
Location:
United States, Mountain View
Salary:
100600.00 - 199000.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C, C++, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements are required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Job Responsibility:
  • Invent and implement innovative compiler features and advanced optimization passes, leveraging tools such as LLVM, MLIR, Torch Dynamo, and Triton
  • Develop code generation techniques for new hardware platforms
  • Design and develop cutting edge AI software in C++ and Python
  • Optimize AI workloads
  • Design new programming abstractions for AI
  • Collaborate broadly across multiple disciplines from hardware architects to ML developers
  • Identify requirements, plan and design solutions, estimate effort, and schedule deliverables
  • Help establish and drive the adoption of outstanding coding standards and patterns and help enhance our inclusive engineering culture
  • Embody Microsoft's culture and values
Employment Type:
Fulltime

Member of Technical Staff, Software Co-Design AI HPC Systems

Our team’s mission is to architect, co-design, and productionize next-generation...
Location:
United States, Mountain View
Salary:
139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
  • Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
  • Strong background in one or more of the following areas: AI accelerator or GPU architectures
  • Distributed systems and large-scale AI training/inference
  • High-performance computing (HPC) and collective communications
  • ML systems, runtimes, or compilers
  • Performance modeling, benchmarking, and systems analysis
  • Hardware–software co-design for AI workloads
  • Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development.
  • Proven ability to work across organizational boundaries and influence technical decisions involving multiple stakeholders.
Job Responsibility:
  • Lead the co-design of AI systems across hardware and software boundaries, spanning accelerators, interconnects, memory systems, storage, runtimes, and distributed training/inference frameworks.
  • Drive architectural decisions by analyzing real workloads, identifying bottlenecks across compute, communication, and data movement, and translating findings into actionable system and hardware requirements.
  • Co-design and optimize parallelism strategies, execution models, and distributed algorithms to improve scalability, utilization, reliability, and cost efficiency of large-scale AI systems.
  • Develop and evaluate what-if performance models to project system behavior under future workloads, model architectures, and hardware generations, providing early guidance to hardware and platform roadmaps.
  • Partner with compiler, kernel, and runtime teams to unlock the full performance of current and next-generation accelerators, including custom kernels, scheduling strategies, and memory optimizations.
  • Influence and guide AI hardware design at system and silicon levels, including accelerator microarchitecture, interconnect topology, memory hierarchy, and system integration trade-offs.
  • Lead cross-functional efforts to prototype, validate, and productionize high-impact co-design ideas, working across infrastructure, hardware, and product teams.
  • Mentor senior engineers and researchers, set technical direction, and raise the overall bar for systems rigor, performance engineering, and co-design thinking across the organization.
Employment Type:
Fulltime

Senior Runtime Engineer

We are building the next generation of large-scale AI systems that power trainin...
Location:
United States; Canada, Sunnyvale; Toronto
Salary:
Not provided
Cerebras Systems
Expiration Date:
Until further notice
Requirements:
  • 3+ years of experience developing high-performance or distributed system software
  • Strong programming skills in C/C++, with expertise in multi-threading, memory management, and performance optimization
  • Experience with distributed systems, networking, or inter-process communication
  • Solid understanding of data structures, concurrency, and system-level resource management (CPU, I/O, and memory)
  • Proven ability to debug, profile, and optimize code across scales—from threads to clusters
  • Bachelor’s, Master’s, or equivalent experience in Computer Science, Electrical Engineering, or related field
Job Responsibility:
  • Design and implement distributed runtime components to efficiently manage large-scale execution workloads
  • Develop and optimize high-performance data and communication pipelines that fully utilize CPU, memory, storage, and network resources
  • Enable scalable execution across multiple compute nodes, ensuring high concurrency and minimal bottlenecks
  • Collaborate closely with ML and compiler teams to integrate new model architectures, training regimes, and hardware-specific optimizations
  • Diagnose and resolve complex performance issues across the software stack using profiling and instrumentation tools
  • Contribute to overall system design, architecture reviews, and roadmap planning for large-scale AI workloads
What we offer:
  • Build a breakthrough AI platform beyond the constraints of the GPU
  • Publish and open-source cutting-edge AI research
  • Work on one of the fastest AI supercomputers in the world
  • Enjoy job stability with startup vitality
  • A simple, non-corporate work culture that respects individual beliefs

Senior Full Stack LLM Engineer - Training

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. ...
Location:
United States; Canada, Sunnyvale; Toronto
Salary:
Not provided
Cerebras Systems
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s, Master’s, or PhD in Computer Science, Engineering, or a related field
  • 5+ years of relevant industry experience (internship/co-op experience included)
  • Comfort navigating the full AI toolchain: Python modeling code, compiler IRs, performance profiling, etc.
  • Strong debugging skills across performance, numerical accuracy, and runtime integration
  • Experience with deep learning frameworks (e.g., PyTorch, TensorFlow) and familiarity with model internals (e.g., attention, MoE, diffusion)
  • Proficiency in C/C++ programming and experience with low-level optimization
  • Proven experience in compiler development, particularly with LLVM and/or MLIR
  • Strong background in optimization techniques, particularly those involving NP-hard problems
Job Responsibility:
  • Contribute to the end-to-end bring up of ML models on Cerebras CSX systems
  • Work across the stack: model architecture translation, graph lowering, compiler optimizations, runtime integration, and performance tuning
  • Debug performance and correctness issues spanning model code, compiler IRs, runtime behavior, and hardware utilization
  • Propose and prototype improvements across tools, APIs, or automation flows to accelerate future bring ups
What we offer:
  • Competitive salary and benefits package
  • Opportunities for professional growth and career advancement
  • A dynamic and innovative work environment
  • The chance to work on cutting-edge technologies and make a significant impact on the future of AI

Senior Machine Learning Engineer

As a Machine Learning Engineer at Dedrone, you’ll play a pivotal role in advanci...
Location:
United States, Sterling
Salary:
Not provided
Axon
Expiration Date:
Until further notice
Requirements:
  • 5+ years of professional experience in modern C++ (C++14/17 or later), with strong object-oriented and generic programming skills
  • Deep understanding of multithreading and concurrency (threads, thread pools, locks, lock-free structures, atomics, futures, async patterns) and experience building robust, concurrent systems
  • Hands-on experience with parallel processing frameworks or patterns (SIMD, task-based parallelism, GPU offload, or similar) for real-time or high-throughput applications
  • Strong command of data structures and algorithms, and the ability to choose and implement the right structures for performance-critical, memory-constrained environments
  • Proven experience with memory management and performance optimization in C++ (stack vs heap, custom allocators, cache-aware design, avoiding fragmentation, RAII, move semantics)
  • Practical experience with CUDA (or similar GPU programming frameworks): writing kernels, managing GPU memory, optimizing for occupancy and bandwidth, and integrating with C++ codebases
  • Familiarity with Linux-based development (build systems like CMake, unit testing frameworks, containerization and/or cross-compilation for edge devices)
  • Strong debugging and profiling skills across CPU and GPU, and a methodical approach to benchmarking and regression testing
  • Excellent collaboration and communication skills, with a track record of working closely with research or ML teams to move algorithms from prototype to production
Job Responsibility:
  • Design and implement high-performance C++ software that runs computer vision and tracking algorithms in real time on edge devices
  • Work closely with computer vision / self-supervised learning engineers to integrate their models into production pipelines, including pre/post-processing, I/O, and system orchestration
  • Build and optimize multithreaded and parallel processing pipelines for ingesting, synchronizing, and processing data from a networked system of cameras
  • Implement and tune CUDA kernels and GPU-accelerated components to maximize throughput and minimize latency for inference, tracking, and search
  • Design robust data structures and memory management strategies for handling large volumes of video, sensor, and metadata streams under tight compute and power constraints
  • Profile and optimize code using tools such as perf, valgrind, nvprof / Nsight, and similar to identify bottlenecks and improve CPU/GPU utilization
  • Collaborate with simulation and CV teams to deploy and evaluate algorithms in realistic test scenarios, including fault handling and performance monitoring
  • Develop clean, well-tested, and well-documented C++ libraries and services that can be reused across products and future airspace applications
  • Contribute to system-level architecture decisions, including inter-process communication, scheduling, resource allocation, and deployment strategies on edge platforms
What we offer:
  • Competitive salary and 401k with employer match
  • Discretionary paid time off
  • Paid parental leave for all
  • Medical, Dental, Vision plans
  • Fitness Programs
  • Emotional & Mental Wellness support
  • Learning & Development programs
  • Snacks in our offices
Employment Type:
Fulltime

Groundskeeper

We’re Virtu Property — specialists in soft services for residential property. Ou...
Location:
United Kingdom, Fareham
Salary:
12.82 GBP / Hour
360 Resourcing Solutions
Expiration Date:
Until further notice
Requirements:
  • Experience as a Groundskeeper with a track record of working in a similar role
  • Full valid U.K. driving licence
  • PA1 and PA6 certificate of competence in the use and handling of herbicides (desired but not essential)
Job Responsibility:
  • Assist with day-to-day maintenance tasks and repairs set by line manager
  • Assess site for any specific issues that may have arisen and report back if appropriate
  • Work to a plan for each day's schedule of works to ensure all necessary tasks are completed to a high standard within the time allowed
  • Litter pick external grounds
  • Complete all appropriate landscaping work
  • Ensure site is left tidy and looking its very best
What we offer:
  • 21 days’ holiday plus bank holidays (Pro Rata)
  • Full uniform provided
  • Access to a company vehicle and fuel card
  • Pension of up to 5% from the employee and 3% from the employer
  • Study support and access to our dedicated academy
  • 24/7 mental health services
  • Extra day off on your birthday
  • One volunteer day annually to support a charity or cause close to your heart
  • One online platform for all benefits and recognition
Employment Type:
Parttime