ML Compiler and Performance Engineer Job at Meta (Bellevue)

Software Engineer, Triton Compiler

As a Software Engineer, you will help build AI systems that achieve levels of pe...

Location

United States, San Francisco

Salary:

266000.00 - 445000.00 USD / Year

OpenAI

Expiration Date

Until further notice

Requirements

3+ years of relevant engineering experience, ideally in systems, compilers, ML frameworks, or performance engineering
Ability to own problems end-to-end, including learning new hardware and software domains as needed

Job Responsibility

Help build AI systems that achieve levels of performance that were previously impossible
Design and optimize core ML systems
Write highly reliable low-level code
Advance the algorithms and infrastructure that power our models
Design and build the compilers, languages, and high-performance kernels that allow researchers to fully exploit our first-party accelerators
Advance Triton and its backend
Develop new compiler passes
Create the tooling needed to write fast, correct, and deeply optimized kernels for brand-new hardware
Partner closely with the hardware team to unlock new capabilities and ensure our custom silicon can support the next generation of frontier models

What we offer

Equity
Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
401(k) retirement plan with employer match
Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
Mental health and wellness support
Employer-paid basic life and disability coverage
Annual learning and development stipend to fuel your professional growth

Full-time

Software Engineer - Performance Tools

Join our team as a Software Engineer - Performance Tools and take the lead in il...

Location

United States, San Jose

Salary:

150000.00 - 275000.00 USD / Year

Etched

Expiration Date

Until further notice

Requirements

Strong proficiency in C++ or Rust
Proficiency in Python is a plus
Deep understanding of computer architecture (CPU, GPU, accelerators), memory hierarchies (caches, DRAM), and interconnects (especially PCIe)
Proven experience in low-level performance analysis, profiling, and bottleneck identification on complex hardware systems (GPUs, CPUs, FPGAs, or custom ASICs)
Experience with performance analysis tools (e.g., NVIDIA Nsight, AMD uProf, Intel VTune, perf, Tracy, ETW)
Experience working close to hardware, potentially reading performance counters or interacting directly with device drivers

Job Responsibility

Tool Architecture & Design: Lead the design and architecture of a comprehensive performance analysis suite, including data collection mechanisms, data processing pipelines, analysis engines, and user interfaces (CLI and/or GUI)
Low-Level Data Collection: Develop robust methods to capture performance data directly from our custom ML accelerator hardware (e.g., hardware performance counters, execution unit status, memory access patterns) via driver interfaces or other mechanisms
Host & System Tracing: Implement tracing for host-side API calls (runtime libraries, driver interactions) and system-level events (CPU activity, PCIe traffic, memory usage, network contention) related to Sohu workloads
Data Correlation & Synchronization: Design and implement techniques to accurately correlate performance events across the host CPU, device driver, PCIe bus, multiple accelerators, and multiple hosts, ensuring precise time synchronization
Performance Analysis Engine: Build analysis modules to automatically interpret collected trace and counter data, identifying key performance limiters (e.g., compute-bound, memory bandwidth-bound, latency-bound, PCIe-bound, specific hardware bottlenecks)
Visualization & Reporting: Develop intuitive visualizations (timelines, dependency graphs, resource utilization charts, statistical summaries) to clearly communicate performance characteristics and bottlenecks to users
Collaboration & Support: Work closely with hardware architects, firmware engineers, driver developers, compiler engineers, and ML application engineers to understand their needs, define tool requirements, and provide expert guidance on performance analysis and optimization using the tool

What we offer

Medical, dental, and vision packages with generous premium coverage
$500 per month credit for waiving medical benefits
Housing subsidy of $2k per month for those living within walking distance of the office
Relocation support for those moving to San Jose (Santana Row)
Various wellness benefits covering fitness, mental health, and more
Daily lunch + dinner in our office

Full-time

LLM Inference Performance & Evals Engineer

Join the inference model team dedicated to bringing up state-of-the-art models,...

Location

Canada, Toronto

Salary:

Not provided

Cerebras Systems

Expiration Date

Until further notice

Requirements

3+ years building high-performance ML or systems software
Solid grounding in Transformer math—attention scaling, KV-cache, quantisation—or clear evidence you learn this material rapidly
Comfort navigating the full AI toolchain: Python modeling code, compiler IRs, performance profiling, etc.
Strong debugging skills across performance, numerical accuracy, and runtime integration
Prior experience in modeling, compilers, or crafting benchmarks or performance studies, not just black-box QA tests
Strong passion for leveraging AI agents or workflow orchestration tools to boost personal productivity

Job Responsibility

Prototype and benchmark cutting-edge ideas: new attention variants, MoE, speculative decoding, and other innovations as they emerge
Develop agent-driven automation that designs experiments, schedules runs, triages regressions, and drafts pull-requests
Work closely with compiler, runtime, and silicon teams: a unique opportunity to experience the full stack of software/hardware innovation
Keep pace with the latest open- and closed-source models, and run them first on wafer scale to expose new optimization opportunities

What we offer

Build a breakthrough AI platform beyond the constraints of the GPU
Publish and open-source your cutting-edge AI research
Work on one of the fastest AI supercomputers in the world
Enjoy job stability with startup vitality
A simple, non-corporate work culture that respects individual beliefs

Software Engineering Manager, Programming Languages and Runtimes (PL&R) Compilers

Meta’s Server LLVM team owns the C++ optimizing compiler that builds the majorit...

Location

United States, Bellevue

Salary:

184000.00 - 257000.00 USD / Year

Senior ML Compiler Engineer

About the Mission: GM’s vision of Zero Crashes, Zero Emissions, and Zero Congest...

Location

United States, Austin

Salary:

128700.00 - 261300.00 USD / Year

General Motors

Expiration Date

Until further notice

Requirements

3+ years of experience in the field of compilers
Experience with ML frameworks (e.g., PyTorch, TensorFlow, JAX) and the surrounding software stack (e.g., ONNX, MLIR, XLA, TVM, TensorRT)
Expertise in writing production-quality Python/C++ code
Expertise in the software development life cycle: coding, debugging, optimization, testing, integration
BS, or higher degree, in CS/CE/EE, or equivalent

Job Responsibility

Build and evolve the model compilation toolchain used to deploy large‑scale perception, prediction, and planning models to the AV
Architect new compiler passes and analyses that improve build times, memory footprint, and runtime latency while preserving, or intentionally trading off, fidelity under strict safety and reliability constraints
Collaborate closely with kernels, runtime, and hardware teams to co‑design interfaces, shape accelerator capabilities, and ensure the compiler exposes the right abstractions to unlock peak performance on each platform
Set standards and best practices for model export, validation, and debugging so that AV teams can iterate quickly with clear, reproducible performance and accuracy characteristics

What we offer

Medical
Dental
Vision
Health Savings Account
Flexible Spending Accounts
Retirement savings plan
Sickness and accident benefits
Life insurance
Paid vacation & holidays
Tuition assistance programs

Full-time

Staff ML Compiler Engineer

As a Staff Compiler Engineer on the AI Kernels & Compilers team, you will own th...

Location

United States, Austin

Salary:

185100.00 - 335300.00 USD / Year

General Motors

Expiration Date

Until further notice

Requirements

5+ years of experience in the field of compilers
Experience with ML frameworks (e.g., PyTorch, TensorFlow, JAX) and the surrounding software stack (e.g., ONNX, MLIR, XLA, TVM, TensorRT)
Expertise in writing production-quality Python/C++ code
Expertise in the software development life cycle: coding, debugging, optimization, testing, integration
BS, or higher degree, in CS/CE/EE, or equivalent

Job Responsibility

Own and evolve the model compilation toolchain used to deploy large‑scale perception, prediction, and planning models to the AV
Architect new compiler passes and analyses that improve build times, memory footprint, and runtime latency while preserving, or intentionally trading off, fidelity under strict safety and reliability constraints
Collaborate closely with kernels, runtime, and hardware teams to co‑design interfaces, shape accelerator capabilities, and ensure the compiler exposes the right abstractions to unlock peak performance on each platform
Set standards and best practices for model export, validation, and debugging so that AV teams can iterate quickly with clear, reproducible performance and accuracy characteristics

What we offer

Medical
Dental
Vision
Health Savings Account
Flexible Spending Accounts
Retirement savings plan
Sickness and accident benefits
Life insurance
Paid vacation & holidays
Tuition assistance programs

Full-time

Research Scientist Intern, PyTorch Compiler

Our team makes PyTorch run faster and more resource-efficient without sacrificin...

Location

United States, Menlo Park

Salary:

7650.00 - 12134.00 USD / Month

Software Engineer, Infra PyTorch (PhD)

This role is about developing the core PyTorch 2.0 technologies, innovating and ...

Location

United States, Menlo Park

Salary:

181000.00 USD / Year
