Research Intern - AI Inference Architecture

United States, Redmond · 6710.00 - 13270.00 USD / Month · Job Posted April 16, 2026

Job Description

Research Internships at Microsoft provide a dynamic environment for research careers, with a network of world-class research labs led by globally recognized scientists and engineers who pursue innovation across a range of scientific and technical disciplines to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment. This Research Internship is an opportunity to work alongside world-class researchers and engineers to define future generations of AI inference system architectures.

Job Responsibility

  • Research Interns put inquiry and theory into practice
  • Learn, collaborate, and network for life
  • Contribute to exciting research and development strides
  • Paired with mentors and expected to collaborate with other Research Interns and researchers, present findings, and contribute to the vibrant life of the community

Requirements

  • Currently enrolled in a PhD program in Computer Science, Computer Engineering or a related STEM field
  • At least 1 year of experience working with LLM inference software stack and systems
  • Research Interns are expected to be physically located in their manager’s Microsoft worksite location for the duration of their internship
  • Submit a minimum of two reference letters for this position, as well as a cover letter and any relevant work or research samples

Nice to have

  • Familiarity with computer architecture and system performance modeling
  • Experience with PyTorch, CUDA, and parallel programming
  • Demonstrated ability to develop original research agendas
  • Ability to think unconventionally and derive creative, innovative solutions
  • Proficient communication skills, both written and verbal
  • Proven interpersonal skills with the ability to work effectively across groups and cultures

Similar Jobs for Research Intern - AI Inference Architecture

8 matching positions

Research Intern - AI Systems & Architecture

Research Internships at Microsoft provide a dynamic environment for research car...
Location: United States, Mountain View
Salary: 6710.00 - 13270.00 USD / Month
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Currently enrolled in a PhD program in Computer Science, Electrical/Computer Engineering, or a related field
  • Research Interns are expected to be physically located in their manager’s Microsoft worksite location for the duration of their internship
  • Submit a minimum of two reference letters for this position, as well as a cover letter and any relevant work or research samples
Job Responsibility
  • Investigate emerging AI system architectures and analyze how hardware, software, and model behavior interact across large-scale inference workloads
  • Develop and evaluate analytical or simulation-based performance models to identify system bottlenecks, scalability limits, and optimization opportunities
  • Prototype or assess new inference mechanisms, including disaggregated execution, sparse/expert model scaling, and hierarchical attention techniques
  • Explore next-generation accelerator, memory-architecture, and interconnect technologies, assessing their architectural trade-offs and cost implications
  • Conduct experiments, synthesize research findings, and communicate results to mentors and collaborating researchers
  • Collaborate with fellow interns and researchers to advance new ideas in AI systems and architectural design
Employment Type: Full-time

Research Intern - LLM Performance Optimization

Research Internships at Microsoft provide a dynamic environment for research car...
Location: United States, Redmond
Salary: 6710.00 - 13270.00 USD / Month
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Currently enrolled in a PhD program in Computer Science or a related STEM field
  • At least 1 year of experience with Large Language Model architecture or inference performance optimization
  • Research Interns are expected to be physically located in their manager’s Microsoft worksite location for the duration of their internship
  • Submit a minimum of two reference letters, a cover letter, and any relevant work or research samples
Job Responsibility
  • Research Interns put inquiry and theory into practice
  • Learn, collaborate, and network for life
  • Contribute to exciting research and development strides
  • Paired with mentors and expected to collaborate with other Research Interns and researchers
  • Present findings
  • Contribute to the vibrant life of the community
Employment Type: Full-time

Research Intern - Systems For Efficient AI

Research Internships at Microsoft provide a dynamic environment for research car...
Location: United States, Redmond
Salary: 6710.00 - 13270.00 USD / Month
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Accepted or currently enrolled in a PhD program in Computer Science, Software Engineering, Electrical Engineering, or a related STEM field
  • Experience with LLM architectures, systems for LLM inference, and/or AI hardware
  • Experience with GPUs and understanding of CUDA/ROCm frameworks
  • Experience with computer systems and/or networks
  • Experience in conducting research and writing peer-reviewed publications
  • Proficient written and verbal communication skills
  • Be able to work in a cross-functional and multi-disciplinary setting across research and product
  • Proficient software development skills, preferably in C++ and Python
Job Responsibility
  • Research Interns put inquiry and theory into practice
  • Learn, collaborate, and network for life
  • Advance their own careers and contribute to exciting research and development strides
  • Paired with mentors and expected to collaborate with other Research Interns and researchers
  • Present findings
  • Contribute to the vibrant life of the community
Employment Type: Full-time

Research Intern - Networking Research Group

Research Internships at Microsoft provide a dynamic environment for research car...
Location: United States, Redmond
Salary: 6710.00 - 13270.00 USD / Month
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Currently enrolled in a Ph.D. program in Computer Science, Electrical Engineering or a related STEM field
  • Ability to think unconventionally to derive creative and innovative solutions
  • Have at least one year of experience with systems building
  • Experience in one of the following: network hardware, physical layer technologies, mobile systems and devices, network architecture, operations, and design, network security and privacy, AI/machine learning, analysis and optimization
  • Knowledge of large models, and experience training them at scale, or running inference is a plus
  • Ability to collaborate effectively with your mentor and other researchers
  • Research Interns are expected to be physically located in their manager’s Microsoft worksite location for the duration of their internship
  • Submit a minimum of two reference letters for this position as well as a cover letter and any relevant work or research samples
Job Responsibility
  • Research Interns put inquiry and theory into practice
  • Learn, collaborate, and network for life
  • Advance their own careers
  • Contribute to exciting research and development strides
  • Collaborate with other Research Interns and researchers
  • Present findings
  • Contribute to the vibrant life of the community
Employment Type: Full-time

Research Intern - AI Frameworks (Network Systems and Tools)

Research Internships at Microsoft provide a dynamic environment for research car...
Location: United States, Redmond
Salary: 6710.00 - 13270.00 USD / Month
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Currently enrolled in a PhD program in Computer Science, Electrical/Computer Engineering, or a related field
  • Research experience in areas such as computer architecture, AI/ML systems, performance modeling, distributed systems, or hardware–software co-design
  • Programming skills in Python, C/C++ with experience building prototypes, simulators, or performance analysis tools
  • Familiarity with modern AI workloads and/or deep learning frameworks (e.g., PyTorch)
  • Demonstrated ability to define and pursue original research directions in AI systems or architecture
  • Ability to collaborate effectively with researchers across disciplines and work in cross-group, cross-cultural environments
  • Proficient communication and presentation skills for sharing complex technical insights
  • Ability to think creatively and approach system and architecture challenges with unconventional or innovative solutions
  • Experience with PyTorch, CUDA, Triton, or performance-simulation tools
  • Background in large-scale system design, AI inference bottleneck analysis, or modeling cost/performance tradeoffs
Job Responsibility
  • Investigate and evaluate emerging disaggregated KV cache architectures
  • Implement a hierarchical storage architecture with multiple tiers:
      • GPU Memory: Active working set of KV caches currently used by the model
      • CPU DRAM: Hot cache for recently used KV chunks, using pinned memory for efficient GPU-CPU transfers
      • Local Storage: Large-scale local caching (NVMe, local disk)
  • Build a Peer-to-Peer (P2P) service architecture for KV cache sharing that enables direct, high-performance cache transfer between multiple LLM serving instances without requiring centralized cache servers
Employment Type: Full-time

Senior Research Engineer

As a Senior Research Engineer at Microsoft, you will advance Microsoft’s mission...
Location: United States, Redmond
Salary: 119800.00 - 234700.00 USD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Engineering, Mathematics, Statistics, Physics, or a related field and 4 or more years in applied ML or AI research and product engineering
  • OR Master’s degree and 3 or more years in applied ML or AI research and product engineering
  • OR PhD in a relevant field and 2 or more years with generative AI, LLMs, or related ML algorithms
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • Microsoft Cloud Background Check upon hire/transfer and every two years thereafter
Job Responsibility
  • Bringing State-of-the-Art Research to Products
  • Design and implement AI systems using foundation models, prompt engineering, retrieval-augmented generation, multi-agent architectures, and classic ML
  • Fine-tune large language models on domain-specific data and evaluate via offline and online methods such as A/B testing, telemetry, and shadow deployments
  • Build and harden prototypes into production-ready services using robust software engineering and MLOps practices
  • Drive original research and thought leadership (whitepapers, internal notes, patents) and convert insights into shipped capabilities
  • Research Translation: Continuously review emerging work, identify high-potential methods, and adapt them to Microsoft problem spaces
  • End-to-End System Development
  • ML Design & Architecture: Own the end-to-end pipeline, from data prep, training, and evaluation to deployment and feedback loops
Employment Type: Full-time

Engineering Manager - Inference

We are looking for an Inference Engineering Manager to lead our AI Inference tea...
Location: United States, San Francisco
Salary: 300000.00 - 385000.00 USD / Year
Company: Perplexity (perplexity.ai)
Expiration Date: Until further notice
Requirements
  • 5+ years of engineering experience with 2+ years in a technical leadership or management role
  • Deep experience with ML systems and inference frameworks (PyTorch, TensorFlow, ONNX, TensorRT, vLLM)
  • Strong understanding of LLM architecture: Multi-Head Attention, Multi/Grouped-Query Attention, and common layers
  • Experience with inference optimizations: batching, quantization, kernel fusion, FlashAttention
  • Familiarity with GPU characteristics, roofline models, and performance analysis
  • Experience deploying reliable, distributed, real-time systems at scale
  • Track record of building and leading high-performing engineering teams
  • Experience with parallelism strategies: tensor parallelism, pipeline parallelism, expert parallelism
  • Strong technical communication and cross-functional collaboration skills
Job Responsibility
  • Lead and grow a high-performing team of AI inference engineers
  • Develop APIs for AI inference used by both internal and external customers
  • Architect and scale our inference infrastructure for reliability and efficiency
  • Benchmark and eliminate bottlenecks throughout our inference stack
  • Drive large sparse/MoE model inference at rack scale, including sharding strategies for massive models
  • Push the frontier by building inference systems that support sparse attention, disaggregated pre-fill/decoding serving, and related techniques
  • Improve the reliability and observability of our systems and lead incident response
  • Own technical decisions around batching, throughput, latency, and GPU utilization
  • Partner with ML research teams on model optimization and deployment
  • Recruit, mentor, and develop engineering talent
What we offer
  • Equity
  • Health
  • Dental
  • Vision
  • Retirement
  • Fitness
  • Commuter and dependent care accounts
Employment Type: Full-time

Research Scientist Intern, AI & System Co-Design

The AI System SW/HW Co-design team’s mission is to explore, develop, and help pr...
Location: United States, Menlo Park
Salary: 7650.00 - 12134.00 USD / Month
Company: Meta (meta.com)
Expiration Date: Until further notice
Requirements
  • Currently has, or is in the process of obtaining, a PhD degree in Computer Science or a related STEM field
  • Knowledge of Hardware Architecture and Distributed systems with interest in one or more of High Performance Computing, Numerics, Performance, and AI hardware including compute, networking, and storage
  • 2+ years experience in one or more of High Performance Computing, Numerics, Performance and AI hardware including compute, networking and storage
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
Job Responsibility
  • Lead and support research that accelerates ML applications over one or more of software, system and accelerator architectures, optimizing training and/or inference of next generation AI workloads here at Meta
  • Work towards long-term ambitious research goals, while identifying intermediate milestones
  • Lead and collaborate on research projects with other researchers and engineers across diverse disciplines
  • Communicate research agenda, progress and results
  • Influence progress of relevant research communities by producing publications