The Follow-on Suggestions team in STCI, part of the Microsoft AI (MAI) organization, develops follow-on suggestion features for Copilot and Bing. The team is recruiting a Software Engineer 2 to lead development of follow-on suggestion experiences that reduce user friction and enable task completion. A suitable candidate will have solid software engineering skills and practical experience integrating machine learning models into production workflows in both online and offline settings. The candidate will also have solid experience building and optimizing distributed systems that use available compute and storage resources efficiently to serve users. We hire people who are motivated to solve hard problems, keen to work in a larger team of data scientists and engineers, and ready to make a difference in how the search and AI assistant landscape evolves. As a Software Engineer with an applied AI focus, you will apply both software engineering and AI expertise to build intelligent, scalable solutions. You’ll collaborate with cross-functional teams to deliver high-impact features aligned with enterprise standards and cloud-scale requirements.
Job Responsibility:
Design, implement, and ship AI-first product capabilities end-to-end, from rapid prototype to production, spanning LLM-powered services, retrieval/grounding pipelines, and intelligent user experiences
Own implementation across the full stack, integrating front-end experiences, back-end services, and AI orchestration layers that connect models, context, and tools to deliver cohesive, extensible, high-performance recommendation systems
Collaborate with design, research, and platform teams to adapt or fine-tune LLMs/SLMs for follow-on scenarios
Build agentic, tool-using workflows that reason across data and services, and optimize them for security, safety, latency, reliability, and cost efficiency
Contribute to engineering excellence across the product lifecycle: secure-by-design practices, accessibility compliance, automated testing, and code craftsmanship
Instrument and evaluate AI features with telemetry, experimentation, and continuous feedback loops to improve user experience
Drive live-site reliability and operational excellence, participating in on-call rotations while maintaining a sustainable, high-ownership engineering culture
Requirements:
Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience, and experience with generative AI
Experience driving experimentation through A/B testing and offline evaluation to measure system performance
Comfortable driving complex server and client architecture across large product teams
Hands-on experience with modern LLM evaluation techniques, including LLM-as-a-Judge, agentic evaluations and RAG assessments
A track record of delivering successful, large-scale applied ML projects in an industry setting
Experience with MLOps practices, model versioning, automated testing, monitoring and CI/CD for machine learning
Proficient coding, debugging, and problem-solving skills
Outstanding communication and collaboration skills