Core AI is at the forefront of Microsoft's mission to redefine how software is built and experienced in the AI era. Our product portfolio includes vital developer tools like VS Code, Visual Studio, GitHub, AI Foundry, and others. Our work enables developers and enterprises to harness the full potential of AI to create intelligent, adaptive, and transformative software.
Guidance is an applied research and development team and technology stack focused on providing more precise control over language models. The team operates across the full research and development lifecycle, from research ideation through production deployment, with a primary focus on language model engines. Guidance advances engine‑level capabilities by developing new techniques that improve model accuracy, speed, reliability, and expressivity across a wide range of execution environments. An industry example of this work is llguidance, which originated structured output capabilities and helped drive their adoption across first‑party Microsoft engines and third‑party model providers.
You will work as a Senior Research Software Development Engineer focused on advancing engine‑level language model capabilities, from applied research through integration. This role is responsible for integrating in‑house techniques and state‑of‑the‑art research into a variety of first‑party (1P) Microsoft engines and third‑party (3P) industry engines. You will translate research ideas into high‑performance, production‑ready implementations, contributing directly to new engine capabilities that improve model correctness, efficiency, robustness, and expressive control.
Job Responsibilities:
Advance language model engine capabilities through applied research and production engineering, integrating in‑house innovations and state‑of‑the‑art techniques to improve model accuracy, speed, reliability, and expressivity across first‑party and third‑party engines
Design, implement, and review performance‑critical engine code (primarily in Python and Rust), ensuring high standards for correctness, test coverage, security, diagnosability, and maintainability, while coaching peers through rigorous and timely code reviews
Apply AI‑native development practices across the full SDLC, using AI tools responsibly for design, coding, testing, and analysis, and taking ownership of the quality and correctness of AI‑assisted outputs while helping establish best practices across the team
Develop and evolve advanced inference techniques (e.g., speculative decoding, constrained decoding, structured generation), validating design choices through experimentation, benchmarking, and production telemetry
Own engine‑level design and integration decisions, producing clear design documents, evaluating trade‑offs across multiple architectural options, and collaborating across teams to ensure solutions meet requirements for performance, scalability, reliability, security, and cost
Drive engineering excellence in production environments, including comprehensive testing strategies, observability, live‑site readiness, incident response, and post‑incident learning, with a focus on reducing operational risk in multi‑tenant inference systems
Contribute to and leverage open‑source LM infrastructure where appropriate, responsibly reusing and extending external code, sharing learnings with the broader community, and continuously staying current with emerging research, tools, and engine‑level techniques
Requirements:
Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, Rust or C++, and Python
OR equivalent experience
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Nice to have:
Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, Rust or C++, and Python
OR Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, Rust or C++, and Python
OR equivalent experience
5+ years of professional software engineering experience, including ownership of complex, production‑quality systems
Strong proficiency in Python and at least one systems programming language (e.g., Rust, C++, or equivalent), with experience writing and maintaining performance‑critical code
Open‑source contributions or industry experience in language model infrastructure (e.g., vLLM, sglang, llguidance, or comparable LM libraries), including work on core engine logic rather than application layers
Hands‑on familiarity with advanced inference techniques, such as speculative decoding, constrained decoding, or related inference‑time capabilities