Research Internships at Microsoft provide a dynamic environment for research careers with a network of world-class research labs led by globally recognized scientists and engineers, who pursue innovation in a range of scientific and technical disciplines to help solve complex challenges in diverse fields, including computing, healthcare, economics, and the environment.

AI workloads are growing at an unprecedented pace, and inference has become one of the most critical challenges in modern computing. Large-scale models demand massive compute resources, and the diversity of hardware across cloud and edge adds complexity. Achieving low latency and high throughput while controlling cost requires rethinking the entire inference stack, from algorithms to infrastructure.

Within our Systems Innovation research group, we pursue a full-stack approach to AI inference and collaborate closely with multiple research teams and product groups across the globe. Some of the research problems we are currently working on relate to request scheduling and batching mechanisms, KV caching optimizations, LLM inference optimizations, and GPU fleet orchestration.

We are looking for Research Interns to help advance the state of the art in systems for efficient AI. The ideal candidate will have a background in systems for AI, including end-to-end AI inference pipelines, request scheduling and batching mechanisms, performance optimizations for AI inference, and KV caching mechanisms.
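To give a flavor of one of the research areas mentioned above, the sketch below is a minimal, illustrative KV cache for autoregressive decoding. It is not part of the role description and does not reflect any Microsoft implementation; the class and method names are hypothetical. Production systems manage cache memory far more carefully (e.g., in fixed-size blocks shared across a batch of requests), but the core idea is the same: store each token's attention keys and values once so that later decode steps avoid recomputing them.

```python
import numpy as np

class KVCache:
    """Toy per-sequence key/value cache for autoregressive decoding.

    Illustrative sketch only: real inference stacks batch many requests
    and manage cache memory in paged blocks to control fragmentation.
    """

    def __init__(self, num_heads: int, head_dim: int):
        self.num_heads = num_heads
        self.head_dim = head_dim
        # Cached tensors grow along the sequence axis as tokens are generated.
        self.keys = np.empty((num_heads, 0, head_dim))
        self.values = np.empty((num_heads, 0, head_dim))

    def append(self, k: np.ndarray, v: np.ndarray) -> None:
        # k, v: (num_heads, 1, head_dim) for the newly generated token.
        self.keys = np.concatenate([self.keys, k], axis=1)
        self.values = np.concatenate([self.values, v], axis=1)

    def attend(self, q: np.ndarray) -> np.ndarray:
        # q: (num_heads, 1, head_dim). Attends over all cached positions,
        # so each decode step is O(sequence length) in attention, while
        # K/V projections for past tokens are never recomputed.
        scores = q @ self.keys.transpose(0, 2, 1) / np.sqrt(self.head_dim)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ self.values
```

The memory cost of such caches (which grows linearly with sequence length and batch size) is exactly what motivates the KV caching and batching optimizations the group studies.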
Job Responsibilities:
Research Interns put inquiry and theory into practice
Learn, collaborate, and network for life
Advance their own careers and contribute to exciting research and development strides
Work with assigned mentors and collaborate with other Research Interns and researchers
Present findings
Contribute to the vibrant life of the community
Requirements:
Accepted or currently enrolled in a PhD program in Computer Science, Software Engineering, Electrical Engineering, or a related STEM field
Experience with LLM architectures, systems for LLM inference, and/or AI hardware
Experience with GPUs and understanding of CUDA/ROCm frameworks
Experience with computer systems and/or networks
Experience in conducting research and writing peer-reviewed publications
Proficient written and verbal communication skills
Ability to work in a cross-functional, multi-disciplinary setting across research and product
Proficient software development skills, preferably in C++ and Python