Member of Technical Staff, LLM Inference - MAI Superintelligence Team Job at Microsoft Corporation (Mountain View)

Job Description

Our Inference team is responsible for building and maintaining the tools and systems that enable Microsoft AI researchers to run models easily and efficiently. Our work empowers researchers to run models in RL, synthetic data generation, evals, and more. We are joint stewards of one of the largest compute fleets in the world. The team is responsible for optimizing compute efficiency on our heterogeneous data centers as well as enabling cutting-edge research and production deployment. We are an applied research team that is embedded directly in Microsoft AI’s research org to work as closely as possible with researchers. We are vertically integrated, owning everything from kernels to architecture co-design to distributed systems to profiling and testing tools.

Job Responsibilities

Work alongside researchers and engineers to implement frontier AI research ideas
Introduce new systems, tools, and techniques to improve model inference performance
Build tools to help debug performance bottlenecks, numeric instabilities, and distributed systems issues
Build tools and establish processes to enhance the team’s collective productivity
Find ways to overcome roadblocks and deliver your work to users quickly and iteratively
Enjoy working in a fast-paced, design-driven product development cycle
Embody our Culture and Values

Requirements

Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience
Experience with generative AI
Experience with distributed computing
Expertise in Python and the Python ecosystem (e.g., uv, pybind/nanobind, FastAPI)
Experience with large scale production inference
Experience with GPU kernel programming
Experience benchmarking, profiling, and optimizing PyTorch generative AI models
Experience with open source inference frameworks like vLLM and SGLang
Working experience with, and fluency in, the material in the JAX scaling book

Nice to have

Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR Bachelor's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience
