
Research Scientist Intern, PyTorch Distributed


Meta


Location:
United States, Menlo Park


Contract Type:
Not provided


Salary:
7650.00 - 12134.00 USD / Month

Job Description:

Meta is seeking a Research Scientist Intern to join our Meta PyTorch Distributed Team. Our team’s mission is to make PyTorch faster and easier to use in order to create and maintain a state-of-the-art machine learning framework that is used across Meta and the entire industry. The key challenges in the team are composing multiple distributed training features to support growing model complexity, jointly optimizing computation and communication to maximize hardware utilization, and automating parallelizations to boost usability. Our internships are twelve (12) to twenty-four (24) weeks long and we have various start dates throughout the year.
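The joint optimization of computation and communication described above is the idea behind PyTorch's DistributedDataParallel, which buckets gradients and launches all-reduce while the backward pass is still running. A minimal single-process sketch (the gloo backend, world size of 1, and toy linear model are illustrative assumptions, not part of the role description):

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def one_ddp_step() -> float:
    # Single-process illustration: world_size=1 with the CPU-friendly gloo backend.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=0, world_size=1)

    model = torch.nn.Linear(8, 1)
    ddp_model = DDP(model)  # gradients are synchronized in buckets across ranks
    opt = torch.optim.SGD(ddp_model.parameters(), lr=0.1)

    x, y = torch.randn(16, 8), torch.randn(16, 1)
    loss = torch.nn.functional.mse_loss(ddp_model(x), y)
    loss.backward()  # all-reduce overlaps with backward as gradient buckets fill
    opt.step()

    dist.destroy_process_group()
    return loss.item()
```

With more than one rank, the same wrapper overlaps the per-bucket all-reduce with the remaining backward computation, which is the compute/communication overlap the posting refers to.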

Job Responsibility:

  • Apply relevant AI and machine learning techniques to advance the state-of-the-art in machine learning frameworks
  • Collaborate with users of PyTorch to enable new use cases for the framework both inside and outside Meta
  • Develop novel, accurate AI algorithms and advanced systems for large scale distributed training and inference
  • Leverage graph-based and compiler-based technologies to optimize distributed training and distributed inference use cases

Requirements:

  • Currently has, or is in the process of obtaining, a PhD degree in the field of Computer Science or a related STEM field
  • Experience in one or more of the following machine learning/deep learning domains:
    - Large-scale training and inference ML systems research
    - ML theory: basic knowledge of ML models in different modalities, such as LLMs (Large Language Models), vision (ViTs, MViTs), and multimodal models, and how scale impacts performance
    - ML systems: AI infrastructure, machine learning accelerators, high-performance computing, machine learning compilers, GPU architecture, machine learning frameworks, distributed systems, on-device optimization
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment

Nice to have:

  • Experience or knowledge on training models at scale using PyTorch/TensorFlow/JAX
  • Experience or knowledge on working with a distributed GPU cluster
  • Intent to return to degree program after the completion of the internship/co-op
  • Proven track record of achieving significant results as demonstrated by grants, fellowships, patents, as well as first-authored publications at leading workshops or conferences such as NeurIPS, MLSys, ASPLOS, PLDI, CGO, PACT, ICML, or similar
  • Experience working and communicating cross functionally in a team environment

Additional Information:

Job Posted:
February 20, 2026


Similar Jobs for Research Scientist Intern, PyTorch Distributed

Research Scientist Intern, PyTorch Compiler

Our team makes PyTorch run faster and more resource-efficient without sacrificin...
Location
United States, Menlo Park
Salary:
7650.00 - 12134.00 USD / Month
Meta
Expiration Date
Until further notice
Requirements
  • Currently has or is in the process of obtaining a PhD degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • Experience in ML compiler, Distributed Training, ML systems, or similar
  • Proficient in Python or Cuda programming
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
Job Responsibility
  • Develop new techniques in TorchDynamo, TorchInductor, PyTorch core, PyTorch Distributed
  • Explore the intersection of PyTorch compiler and PyTorch Distributed
  • Optimize Generative AI models across the stack (pre-training, fine-tuning, and inference)
  • Improve general PyTorch performance
  • Conduct cutting-edge research on ML compiler and ML distributed technologies
  • Collaborate with users of PyTorch to enable new use cases for the framework both inside and outside Meta

Research Scientist Intern, PyTorch Framework Performance

Our team’s mission is to make PyTorch models high-performing, deterministic and ...
Location
United States, Bellevue
Salary:
7650.00 - 12134.00 USD / Month
Meta
Expiration Date
Until further notice
Requirements
  • Currently has, or is in the process of obtaining, a PhD degree in the field of Computer Science or a related STEM field
  • Deep knowledge of transformer architectures, including attention, feed-forward layers, and Mixture-of-Experts (MoE) models
  • Strong background in ML systems research, with domain knowledge in MoE efficiency, such as routing, expert parallelism, communication overheads, and kernel-level optimizations
  • Hands-on experience writing GPU kernels using CUDA and/or cuteDSL
  • Working knowledge of quantization techniques and their impact on performance and accuracy
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
Job Responsibility
  • Design and evaluate communication-aware, kernel-aware, and quantization-aware MoE execution strategies, combining ideas such as expert placement, routing, batching, scheduling, and precision selection
  • Develop and optimize GPU kernels and runtime components for MoE workloads, including fused kernels, grouped GEMMs, memory-efficient forward and backward passes
  • Explore quantization techniques (e.g., MXFP8, FP8) in the context of MoE, balancing accuracy, performance, and hardware efficiency
  • Build performance models and benchmarks to analyze compute, memory, communication, and quantization overheads across different sparsity regimes
  • Run experiments on single-node and multi-node GPU systems
  • Collaborate with the open-source community to gather feedback and iterate on the project
  • Contribute to PyTorch (Core, Compile, Distributed) within the scope of the project
  • Improve PyTorch performance in general
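The routing and precision-selection ideas listed above all start from a gating function that picks a few experts per token and mixes their outputs. A minimal top-k gating sketch (the tensor shapes, renormalization step, and the name `topk_route` are illustrative assumptions, not the team's implementation):

```python
import torch

def topk_route(tokens: torch.Tensor, gate_weight: torch.Tensor, k: int = 2):
    """Select k experts per token and renormalize their gate probabilities.

    tokens:      [num_tokens, d_model]
    gate_weight: [d_model, num_experts]
    Returns expert indices [num_tokens, k] and mixing weights [num_tokens, k].
    """
    logits = tokens @ gate_weight                          # [num_tokens, num_experts]
    probs = logits.softmax(dim=-1)
    weights, indices = probs.topk(k, dim=-1)               # keep the k most probable experts
    weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize over chosen experts
    return indices, weights

# Example: route 4 tokens over 8 experts, top-2.
torch.manual_seed(0)
idx, w = topk_route(torch.randn(4, 16), torch.randn(16, 8))
```

Expert placement, batching, and grouped GEMMs then operate on the token-to-expert assignment this gate produces, which is why routing decisions dominate communication and kernel efficiency in MoE workloads.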

Handshake AI Research Intern

Handshake AI builds the data engines that power the next generation of large lan...
Location
United States, San Francisco
Salary:
12000.00 - 15000.00 USD / Month
Handshake
Expiration Date
February 28, 2026
Requirements
  • Current PhD student in CS, ML, NLP, or related field
  • Publication track record at top venues (NeurIPS, ICML, ACL, EMNLP, ICLR, etc.)
  • Hands-on experience training and experimenting with LLMs (e.g., PyTorch, JAX, DeepSpeed, distributed training stacks)
  • Strong empirical rigor and a passion for open-ended AI questions
Job Responsibility
  • LLM Post-Training: Novel RLHF / GRPO pipelines, instruction-following refinements, reasoning-trace supervision
  • LLM Evaluation: new multilingual, long-horizon, or domain-specific benchmarks; automatic vs. human preference studies; robustness diagnostics
  • Data Efficiency: Active-learning loops, data value estimation, synthetic data generation, and low-resource fine-tuning strategies
  • Each intern owns a scoped research project, mentored by a senior scientist, with the explicit goal of an arXiv-ready manuscript or top-tier conference submission
  • Full-time

Research Scientist Intern, Multimodal and Multitasking Machine Learning

Meta Reality Labs Research is looking for upcoming scientists and researchers wi...
Location
United States, Redmond
Salary:
7313.00 - 12134.00 USD / Month
Meta
Expiration Date
Until further notice
Requirements
  • Currently has, or is in the process of obtaining a PhD in the fields of Computer Science, Electrical Engineering, or related field
  • Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment
  • 2+ years research experience in one or more of the following: developing machine learning and computer vision models, optimization of edge computing algorithms, or distributed compute architectures
  • 2+ years experience programming in Python/C++
  • Experience with Deep Learning frameworks (Pytorch, TensorFlow, etc)
Job Responsibility
  • Research on design / model / execution of efficient ML algorithms
  • Research on novel ML or computational imaging algorithms for applications and optimize existing algorithms
  • Research on development and optimization of edge computing algorithms (ML and non-ML)
  • Collaboration with and support of other researchers across various disciplines
  • Communication of research agenda, progress and results
  • Prototyping, building and characterizing experimental systems and custom hardware


Research Scientist Intern, AI & Compute Foundation - MTIA Software

The MTIA (Meta Training & Inference Accelerator) Software team is part of the AI...
Location
Canada, Toronto
Salary:
6240.00 - 10334.00 CAD / Month
Meta
Expiration Date
Until further notice
Requirements
  • Currently has, or is in the process of obtaining, PhD degree in the field of Computer Science or a related STEM field
  • C/C++ programming skills
  • Must obtain work authorization in country of employment at the time of hire, and maintain ongoing work authorization during employment
  • Knowledge of Computer Architecture and Distributed systems with interest in one or more of High Performance Computing, Numerics, Performance and AI hardware including compute, networking and storage
Job Responsibility
  • Development of the software stack with one of the following core focus areas: AI frameworks, compiler stack, high-performance kernel development, and acceleration onto the next generation of hardware architectures
  • Contribute to the development of the industry-leading PyTorch AI framework core compilers to support new state of the art inference and training AI hardware accelerators and optimize their performance
  • Analyze deep learning networks, develop & implement compiler optimization algorithms
  • Collaborating with AI research scientists to accelerate the next generation of deep learning models such as Recommendation systems, Generative AI, Computer vision, NLP etc
  • Performance tuning and optimizations of deep learning framework & software components

Applied AI/ML Scientist

As an Applied AI Scientist in the FieldML team, you will be responsible for deve...
Location
United Arab Emirates
Salary:
Not provided
Cerebras Systems
Expiration Date
Until further notice
Requirements
  • Master’s or PhD in Computer Science, Machine Learning, or related fields
  • Expert-level understanding of modern model architectures, including dense transformers, MoEs, multimodal and sequence models, scaling laws and training dynamics
  • Proven track record of training and/or fine-tuning large models (1B+ parameters) and direct experience with the challenges of large-scale model training
  • Mastery of Python and PyTorch, experience with distributed training frameworks and large-scale distributed data processing pipelines and tools
  • Strong interpersonal and communication skills
  • Effective in collaborative and fast-paced team settings, able to work autonomously and within a team in a dynamic environment, managing multiple projects and pivoting as customer needs evolve
Job Responsibility
  • Customer Use Case Discovery & Project Scoping
  • Collaborate with customer stakeholders to identify the best approaches to their business problem with AI
  • Contribute to the technical scoping of engagements, including feasibility analysis, data quality/availability/readiness assessments, and the selection of optimal model architectures
  • Define project milestones, success metrics, and rigorous evaluation benchmarks
  • Custom SOTA Models and AI Systems Development
  • Architect and execute end-to-end training recipes for custom models, tailoring model architecture and training recipes to meet customer-specific performance and accuracy requirements
  • Design and implement sophisticated adaptation strategies, including continuous pre-training on private datasets, supervised fine-tuning (SFT), and post-training alignment via RLHF or DPO
  • Take full ownership of the training pipeline, from high-performance data preprocessing and tokenization to hyperparameter tuning and loss-curve analysis
  • Navigate the nuances of model convergence on specialized hardware
  • Scale training workloads across Cerebras clusters
What we offer
  • Build a breakthrough AI platform beyond the constraints of the GPU
  • Publish and open-source cutting-edge AI research
  • Work on one of the fastest AI supercomputers in the world
  • Enjoy job stability with startup vitality
  • A simple, non-corporate work culture that respects individual beliefs

Research Scientist Intern, Embodied Foundation Models (Evaluation)

Our team is seeking a talented Applied Scientist Intern to join us for 3-6 month...
Location
United States, Sunnyvale
Salary:
Not provided
Wayve
Expiration Date
Until further notice
Requirements
  • You are currently pursuing a graduate degree in a Computer Science, Machine Learning, Robotics, or related technical field
  • You are proficient in at least one backend/systems programming language (e.g., Python, Ruby, Java, etc.)
  • You have previous experience in vision-language models, large language models, natural language processing, especially around reasoning
  • You have solid software engineering fundamentals, especially in Python
  • You have previously used PyTorch or a similar library for deep learning (e.g. Tensorflow, JAX)
  • Experience with multi-node distributed training of large models
  • You are interested in using large-scale multimodal (vision, language, etc.) datasets to improve embodied AI
  • You have previous publications at leading conferences (e.g., CVPR, ICCV, CoRL, NeurIPS, CoLM, RSS, ICRA, among others)
Job Responsibility
  • Work on foundation models for embodied AI, including large-scale pretraining, post-training, leveraging language, or improving reasoning capabilities
  • Train models on large-scale multimodal (vision, language, etc.) data efficiently in a multi-node distributed system, and evaluate their performance on open (and closed) datasets/benchmarks
  • Lead high-impact research work and publish at a top-tier conference (e.g., CVPR, ICCV, CoRL, NeurIPS, CoLM, RSS, ICRA, among others)