Software Engineer, Systems ML - Frameworks / Compilers / Kernels Job at Meta (Menlo Park)

Software Engineer, Systems ML - Frameworks / Compilers / Kernels (PhD)

In this role, you will be a member of the MTIA (Meta Training & Inference Accele...

Location

United States, Bellevue

Salary:

181000.00 USD / Year

Software Engineer, Systems ML - Compilers / Backend

We are seeking a software engineer to support the development of the compiler to...

Location

United States, Sunnyvale

Salary:

181000.00 USD / Year

Software Engineer, Systems ML - Compilers / Backend

We are seeking a software engineer to support the development of the compiler to...

Location

United States, Sunnyvale

Salary:

217000.00 USD / Year

Software Engineer, Hardware

As a software engineer on the Scaling team, you’ll help build and optimize the l...

Location

United States, San Francisco

Salary:

266000.00 - 455000.00 USD / Year

OpenAI

Expiration Date

Until further notice

Requirements

Proficient in systems programming (e.g., Rust, C++) and scripting languages like Python
Experience in one or more of the following areas: compiler development, kernel authoring, accelerator programming, runtime systems, distributed systems, or high-performance simulation
Deep curiosity about how large-scale systems work, and enjoyment in making them faster, simpler, and more reliable
Excited to work in a fast-paced, highly collaborative environment with evolving hardware and ML system demands
Value engineering excellence, technical leadership, and thoughtful system design

Job Responsibility

Design and build APIs and runtime components to orchestrate computation and data movement across heterogeneous ML workloads
Contribute to compiler infrastructure, including the development of optimizations and compiler passes to support evolving hardware
Engineer and optimize compute and data kernels, ensuring correctness, high performance, and portability across simulation and production environments
Profile and optimize system bottlenecks, especially around I/O, memory hierarchy, and interconnects, at both local and distributed scales
Develop simulation infrastructure to validate runtime behaviors, test training stack changes, and support early-stage hardware and system development
Rapidly deploy runtime and compiler updates to new supercomputing builds in close collaboration with hardware and research teams
Work across a diverse stack, primarily using Rust and Python, with opportunities to influence architecture decisions across the training framework

What we offer

Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
401(k) retirement plan with employer match
Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
Mental health and wellness support
Employer-paid basic life and disability coverage
Annual learning and development stipend to fuel your professional growth
Daily meals in our offices, and meal delivery credits as eligible

Full-time

Software Engineer, Triton Compiler

As a Software Engineer, you will help build AI systems that achieve levels of pe...

Location

United States, San Francisco

Salary:

266000.00 - 445000.00 USD / Year

OpenAI

Expiration Date

Until further notice

Requirements

3+ years of relevant engineering experience, ideally in systems, compilers, ML frameworks, or performance engineering
Owning problems end-to-end, including learning new hardware and software domains as needed

Job Responsibility

Help build AI systems that achieve levels of performance that were previously impossible
Design and optimize core ML systems
Write highly reliable low-level code
Advance the algorithms and infrastructure that power our models
Design and build the compilers, languages, and high-performance kernels that allow researchers to fully exploit our first-party accelerators
Advance Triton and its backend
Develop new compiler passes
Create the tooling needed to write fast, correct, and deeply optimized kernels for brand-new hardware
Partner closely with the hardware team to unlock new capabilities and ensure our custom silicon can support the next generation of frontier models

What we offer

Offers Equity
Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
401(k) retirement plan with employer match
Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
Mental health and wellness support
Employer-paid basic life and disability coverage
Annual learning and development stipend to fuel your professional growth

Full-time

Member of Technical Staff, Software Co-Design AI HPC Systems

Our team’s mission is to architect, co-design, and productionize next-generation...

Location

United States, Mountain View

Salary:

139900.00 - 274800.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
Strong background in one or more of the following areas: AI accelerator or GPU architectures; distributed systems and large-scale AI training/inference; high-performance computing (HPC) and collective communications; ML systems, runtimes, or compilers; performance modeling, benchmarking, and systems analysis; hardware–software co-design for AI workloads
Proficiency in systems-level programming (e.g., C/C++, CUDA, Python) and performance-critical software development.
Proven ability to work across organizational boundaries and influence technical decisions involving multiple stakeholders.

Job Responsibility

Lead the co-design of AI systems across hardware and software boundaries, spanning accelerators, interconnects, memory systems, storage, runtimes, and distributed training/inference frameworks.
Drive architectural decisions by analyzing real workloads, identifying bottlenecks across compute, communication, and data movement, and translating findings into actionable system and hardware requirements.
Co-design and optimize parallelism strategies, execution models, and distributed algorithms to improve scalability, utilization, reliability, and cost efficiency of large-scale AI systems.
Develop and evaluate what-if performance models to project system behavior under future workloads, model architectures, and hardware generations, providing early guidance to hardware and platform roadmaps.
Partner with compiler, kernel, and runtime teams to unlock the full performance of current and next-generation accelerators, including custom kernels, scheduling strategies, and memory optimizations.
Influence and guide AI hardware design at system and silicon levels, including accelerator microarchitecture, interconnect topology, memory hierarchy, and system integration trade-offs.
Lead cross-functional efforts to prototype, validate, and productionize high-impact co-design ideas, working across infrastructure, hardware, and product teams.
Mentor senior engineers and researchers, set technical direction, and raise the overall bar for systems rigor, performance engineering, and co-design thinking across the organization.

Full-time

Senior Machine Learning Engineer

As a Machine Learning Engineer at Dedrone, you’ll play a pivotal role in advanci...

Location

United States, Sterling

Salary:

Not provided

Axon

Expiration Date

Until further notice

Requirements

5+ years of professional experience in modern C++ (C++14/17 or later), with strong object-oriented and generic programming skills
Deep understanding of multithreading and concurrency (threads, thread pools, locks, lock-free structures, atomics, futures, async patterns) and experience building robust, concurrent systems
Hands-on experience with parallel processing frameworks or patterns (SIMD, task-based parallelism, GPU offload, or similar) for real-time or high-throughput applications
Strong command of data structures and algorithms, and the ability to choose and implement the right structures for performance-critical, memory-constrained environments
Proven experience with memory management and performance optimization in C++ (stack vs heap, custom allocators, cache-aware design, avoiding fragmentation, RAII, move semantics)
Practical experience with CUDA (or similar GPU programming frameworks): writing kernels, managing GPU memory, optimizing for occupancy and bandwidth, and integrating with C++ codebases
Familiarity with Linux-based development (build systems like CMake, unit testing frameworks, containerization and/or cross-compilation for edge devices)
Strong debugging and profiling skills across CPU and GPU, and a methodical approach to benchmarking and regression testing
Excellent collaboration and communication skills, with a track record of working closely with research or ML teams to move algorithms from prototype to production

Job Responsibility

Design and implement high-performance C++ software that runs computer vision and tracking algorithms in real time on edge devices
Work closely with computer vision / self-supervised learning engineers to integrate their models into production pipelines, including pre/post-processing, I/O, and system orchestration
Build and optimize multithreaded and parallel processing pipelines for ingesting, synchronizing, and processing data from a networked system of cameras
Implement and tune CUDA kernels and GPU-accelerated components to maximize throughput and minimize latency for inference, tracking, and search
Design robust data structures and memory management strategies for handling large volumes of video, sensor, and metadata streams under tight compute and power constraints
Profile and optimize code using tools such as perf, valgrind, nvprof / Nsight, and similar to identify bottlenecks and improve CPU/GPU utilization
Collaborate with simulation and CV teams to deploy and evaluate algorithms in realistic test scenarios, including fault handling and performance monitoring
Develop clean, well-tested, and well-documented C++ libraries and services that can be reused across products and future airspace applications
Contribute to system-level architecture decisions, including inter-process communication, scheduling, resource allocation, and deployment strategies on edge platforms

What we offer

Competitive salary and 401k with employer match
Discretionary paid time off
Paid parental leave for all
Medical, Dental, Vision plans
Fitness Programs
Emotional & Mental Wellness support
Learning & Development programs
Snacks in our offices

Full-time

IT Training Lead

The IT Training Lead will drive technology learning and user adoption across the...

Location

United States, Delray Beach

Salary:

Not provided

Robert Half

Expiration Date

Until further notice

Requirements

Experience in IT training, instructional design, technical enablement, or learning and development
Strong knowledge of Microsoft 365
Excellent communication, facilitation, and content development skills
Ability to translate technical concepts into practical, user-friendly training.

Job Responsibility

Design, develop, and deliver IT training programs in instructor-led, virtual, and self-paced formats
Lead the Microsoft Copilot and AI training strategy, including onboarding, advanced use cases, responsible AI usage, and ongoing enablement
Partner with IT leadership to support new technology rollouts, system upgrades, and digital transformation initiatives
Create and maintain training content, including videos, guides, tutorials, and job aids
Identify skill gaps and develop targeted learning solutions to improve adoption and productivity
Gather feedback and measure training effectiveness to continuously improve programs.
