Research Scientist Intern, PyTorch Compiler

United States, Menlo Park 7650.00 - 12134.00 USD / Month · Job Posted January 29, 2026

Job Description

Our team makes PyTorch run faster and more resource-efficient without sacrificing its flexibility and ease of use.

Team scope:

  • Advance PyTorch 2.0 technologies that bring torch.compile() to the heart of PyTorch
  • Advance PyTorch Distributed through torch.compile()
  • Improve PyTorch out-of-the-box performance on GPUs, CPUs, and accelerators
  • Vertical performance optimization of models for training and inference

Our team at Meta AI offers twelve (12) to twenty-four (24) week internships, with various start dates throughout the year.

Job Responsibility

  • Develop new techniques in TorchDynamo, TorchInductor, PyTorch core, PyTorch Distributed
  • Explore the intersection of the PyTorch compiler and PyTorch Distributed
  • Optimize Generative AI models across the stack (pre-training, fine-tuning, and inference)
  • Improve general PyTorch performance
  • Conduct cutting-edge research on ML compiler and ML distributed technologies
  • Collaborate with users of PyTorch to enable new use cases for the framework both inside and outside Meta

Requirements

  • Currently has or is in the process of obtaining a PhD degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • Experience in ML compilers, distributed training, ML systems, or similar
  • Proficient in Python or CUDA programming
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment

Nice to have

  • Experience working on other ML compiler stacks, especially the PT2 stack or Triton
  • Experience doing performance optimization on machine learning models
  • Intent to return to degree program after the completion of the internship/co-op
  • Proven track record of achieving significant results as demonstrated by grants, fellowships, patents, as well as first-authored publications at leading workshops or conferences such as NeurIPS, MLSys, ASPLOS, PLDI, CGO, PACT, ICML, or similar
  • Experience working and communicating cross functionally in a team environment

Similar Jobs for Research Scientist Intern, PyTorch Compiler

8 matching positions

Research Scientist Intern, PyTorch Framework Performance

Our team’s mission is to make PyTorch models high-performing, deterministic and ...
Location: United States, Bellevue
Salary: 7650.00 - 12134.00 USD / Month
Company: Meta
Expiration Date: Until further notice
Requirements
  • Currently has, or is in the process of obtaining, a PhD degree in the field of Computer Science or a related STEM field
  • Deep knowledge of transformer architectures, including attention, feed-forward layers, and Mixture-of-Experts (MoE) models
  • Strong background in ML systems research, with domain knowledge in MoE efficiency, such as routing, expert parallelism, communication overheads, and kernel-level optimizations
  • Hands-on experience writing GPU kernels using CUDA and/or cuteDSL
  • Working knowledge of quantization techniques and their impact on performance and accuracy
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
Job Responsibility
  • Design and evaluate communication-aware, kernel-aware, and quantization-aware MoE execution strategies, combining ideas such as expert placement, routing, batching, scheduling, and precision selection
  • Develop and optimize GPU kernels and runtime components for MoE workloads, including fused kernels, grouped GEMMs, memory-efficient forward and backward passes
  • Explore quantization techniques (e.g., MXFP8, FP8) in the context of MoE, balancing accuracy, performance, and hardware efficiency
  • Build performance models and benchmarks to analyze compute, memory, communication, and quantization overheads across different sparsity regimes
  • Run experiments on single-node and multi-node GPU systems
  • Collaborate with the open-source community to gather feedback and iterate on the project
  • Contribute to PyTorch (Core, Compile, Distributed) within the scope of the project
  • Improve PyTorch performance in general

Research Scientist Intern, PyTorch On-Device

PyTorch is Meta’s deep learning framework for fast, flexible AI/ML experimentati...
Location: United States, Bellevue
Salary: 7650.00 - 12134.00 USD / Month
Company: Meta
Expiration Date: Until further notice
Requirements
  • Currently has, or is in the process of obtaining, a PhD degree in the field of Computer Science or a related STEM field
  • Experience in ML compilers, sparsity, quantization, kernel development, or similar as applied to on-device and highly-constrained environments
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
Job Responsibility
  • Develop new or apply existing performance techniques to on-device AI
  • Explore quantization, sparsity, and model/software co-design as solutions
  • Apply knowledge and research to advance the state-of-the-art in on-device machine learning frameworks
  • Collaborate with users and developers of PyTorch and ExecuTorch to enable new use cases inside and outside Meta

Research Scientist Intern, AI & Compute Foundation - MTIA Software

The MTIA (Meta Training & Inference Accelerator) Software team is part of the AI...
Location: Canada, Toronto
Salary: 6240.00 - 10334.00 CAD / Month
Company: Meta
Expiration Date: Until further notice
Requirements
  • Currently has, or is in the process of obtaining, a PhD degree in the field of Computer Science or a related STEM field
  • C/C++ programming skills
  • Must obtain work authorization in country of employment at the time of hire, and maintain ongoing work authorization during employment
  • Knowledge of Computer Architecture and Distributed systems with interest in one or more of High Performance Computing, Numerics, Performance and AI hardware including compute, networking and storage
Job Responsibility
  • Develop the software stack with one of the following core focus areas: AI frameworks, compiler stack, high-performance kernel development, and acceleration onto the next generation of hardware architectures
  • Contribute to the development of the industry-leading PyTorch AI framework core compilers to support new state-of-the-art inference and training AI hardware accelerators and optimize their performance
  • Analyze deep learning networks, and develop and implement compiler optimization algorithms
  • Collaborate with AI research scientists to accelerate the next generation of deep learning models, such as recommendation systems, generative AI, computer vision, and NLP
  • Tune and optimize the performance of deep learning framework and software components

Research Scientist Intern, PyTorch Distributed

Meta is seeking a Research Scientist Intern to join our Meta PyTorch Distributed...
Location: United States, Menlo Park
Salary: 7650.00 - 12134.00 USD / Month
Company: Meta
Expiration Date: Until further notice
Requirements
  • Currently has, or is in the process of obtaining, a PhD degree in the field of Computer Science or a related STEM field
  • Experience in one or more of the following machine learning/deep learning domains: large-scale training and inference ML systems research; ML theory (basic knowledge of ML models in different modalities, such as LLMs (Large Language Models), vision (ViTs, MViTs), and multimodal, and of how scale impacts performance); ML systems (AI infrastructure, machine learning accelerators, high-performance computing, machine learning compilers, GPU architecture, machine learning frameworks, distributed systems, on-device optimization)
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
Job Responsibility
  • Apply relevant AI and machine learning techniques to advance the state-of-the-art in machine learning frameworks
  • Collaborate with users of PyTorch to enable new use cases for the framework both inside and outside Meta
  • Develop novel, accurate AI algorithms and advanced systems for large scale distributed training and inference
  • Leverage graph-based and compiler-based technologies to optimize distributed training and distributed inference use-cases

Research Scientist Intern, MSL Infra Kernels & Optimizations

Meta’s Meta SuperIntelligence Labs (MSL) Infra Kernels & Optimizations (K&O) tea...
Location: United Kingdom, London
Salary: Not provided
Company: Meta
Expiration Date: Until further notice
Requirements
  • Currently has, or is in the process of obtaining, a PhD degree in the field of Computer Science, Computer Vision, Generative AI, NLP, relevant technical field, or equivalent practical experience
  • Must obtain work authorization in country of employment at the time of hire, and maintain ongoing work authorization during employment
  • Specialized experience in one or more of the following areas: Accelerators/GPU architectures, High Performance Computing (HPC), Machine Learning Compilers, Training/Inference ML Systems, Model Compression, Communication Collectives, ML Kernels/Operator optimizations, Machine learning frameworks (e.g. PyTorch) and SW/HW co-design
  • Experience developing AI-System infrastructure or AI algorithms in C/C++ or Python
Job Responsibility
  • Explore, prototype and productionize highly optimized ML kernels to unlock the full potential of current and future accelerators for Meta’s AI workloads. Open-source SOTA implementations as applicable
  • Explore, co-design and optimize parallelisms, compute efficiency, distributed training/inference paradigms and algorithms to improve the scalability, efficiency and reliability of inference and large-scale training systems
  • Optimize inference and training communications performance at scale and investigate improvements to algorithms, tooling, and interfaces, working across multiple accelerator types and HPC collective communication libraries such as NCCL, RCCL, UCC and MPI
  • Innovate and co-design novel model architectures for sustained scaling and hardware efficiency during training and inference
  • Benchmark, analyze, model and project the performance of AI workloads against a wide range of what-if scenarios and provide early input to the design of future hardware, models and runtime, giving crucial feedback to the architecture, compiler, kernel, modeling and runtime teams
  • Explore, co-design and productionize model compression techniques such as Quantization, Pruning, Distillation and Sparsity to improve training and inference efficiency
  • Collaborate with AI & Systems Co-design to guide Meta’s AI HW strategy

Research Scientist Intern, MSL Infra Kernels & Optimizations

Meta’s Meta SuperIntelligence Labs (MSL) Infra Kernels & Optimizations (K&O) tea...
Location: United States, Menlo Park
Salary: 7650.00 - 12134.00 USD / Month
Company: Meta
Expiration Date: Until further notice
Requirements
  • Currently has, or is in the process of obtaining, a PhD degree in the field of Computer Science, Computer Vision, Generative AI, NLP, relevant technical field, or equivalent practical experience
  • Must obtain work authorization in country of employment at the time of hire, and maintain ongoing work authorization during employment
  • Specialized experience in one or more of the following areas: Accelerators/GPU architectures, High Performance Computing (HPC), Machine Learning Compilers, Training/Inference ML Systems, Model Compression, Communication Collectives, ML Kernels/Operator optimizations, Machine learning frameworks (e.g. PyTorch) and SW/HW co-design
  • Experience developing AI-System infrastructure or AI algorithms in C/C++ or Python
Job Responsibility
  • Explore, prototype and productionize highly optimized ML kernels to unlock the full potential of current and future accelerators for Meta’s AI workloads. Open-source SOTA implementations as applicable
  • Explore, co-design and optimize parallelisms, compute efficiency, distributed training/inference paradigms and algorithms to improve the scalability, efficiency and reliability of inference and large-scale training systems
  • Optimize inference and training communications performance at scale and investigate improvements to algorithms, tooling, and interfaces, working across multiple accelerator types and HPC collective communication libraries such as NCCL, RCCL, UCC and MPI
  • Innovate and co-design novel model architectures for sustained scaling and hardware efficiency during training and inference
  • Benchmark, analyze, model and project the performance of AI workloads against a wide range of what-if scenarios and provide early input to the design of future hardware, models and runtime, giving crucial feedback to the architecture, compiler, kernel, modeling and runtime teams
  • Explore, co-design and productionize model compression techniques such as Quantization, Pruning, Distillation and Sparsity to improve training and inference efficiency
  • Collaborate with AI & Systems Co-design to guide Meta’s AI HW strategy

Team Member

We're Popeyes UK&I, a Times Top 100 ranked company, great place to work and Happ...
Location: United Kingdom, London
Salary: 8.00 - 12.71 GBP / Hour
Company: 360 Resourcing Solutions
Expiration Date: Until further notice
Requirements
  • A positive attitude and a passion for making people's day
  • A team player who brings energy and enthusiasm
  • No experience needed
Job Responsibility
  • Serve our famous shatter crunch chicken with energy and pride
  • Deliver unforgettable guest experiences every shift
  • Master different stations with full training provided
  • Be part of a supportive, fun-loving team
What we offer
  • Flexible scheduling to suit your lifestyle
  • Free chicken on shift + 30% off when you're not working
  • Paid day off on your birthday
  • Clear career progression opportunities
  • 28 days holiday (pro rata)
  • Access up to 30% of your pay early with Wage Stream
  • Enhanced parental leave
  • Pension contributions
  • Gym and cycle-to-work discounts
  • Tech scheme & online perks platform
Employment type: Full-time

Family Law Litigation Attorney-Boutique Firm

A well-established solo practitioner is seeking a litigation associate to join a...
Location: United States, Butler
Salary: Not provided
Company: Beacon Hill
Expiration Date: Until further notice
Requirements
  • Licensed in Pennsylvania
  • At least 1 year of litigation experience
  • Strong writing and advocacy skills
Job Responsibility
  • Family law litigation
  • Exposure to criminal law and probate matters
Employment type: Full-time