Join Microsoft’s CoreAI group as a Principal Research Engineer on the AI Data Platform team—the foundation for secure, scalable, reusable datasets that power AI model development across the company. This central platform manages the full lifecycle of Microsoft’s AI training data, accelerating model development with high-quality, compliant, and reusable datasets and services.
Job Responsibilities:
Design and build a data quality evaluation framework for AI training datasets, including scalable metrics, testing methodologies, and automated reporting
Define and operationalize quality signals aligned to model outcomes (e.g., coverage, diversity, noise/duplication, labeling consistency, safety/toxicity, privacy/compliance risk indicators)
Collaborate with cross-functional stakeholders to run experiments, establish best practices, and deliver reusable tools that scale across multiple model and product teams
Develop task- and model-aware evaluation approaches that connect dataset properties to training performance, reliability, and safety
Create automated dataset validation gates and monitoring to support continuous dataset iteration (e.g., regression detection across dataset versions)
Design and implement synthetic data generation pipelines (LLM-driven and programmatic approaches) to improve long-tail representation, fill coverage gaps, and accelerate iteration cycles
Build guardrails for synthetic data: filtering, scoring, calibration, provenance tracking, and bias/safety checks to ensure quality and compliance
Partner with engineering to integrate evaluation and generation into the platform’s end-to-end data lifecycle
Requirements:
Bachelor's Degree in Computer Science, Electrical or Computer Engineering, or related field AND 6+ years related experience (e.g., statistics, predictive analytics, research)
OR Master's Degree in Computer Science, Electrical or Computer Engineering, or related field AND 4+ years related experience
OR Doctorate in Computer Science, Electrical or Computer Engineering, or related field AND 3+ years related experience
OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements
Microsoft Cloud Background Check upon hire/transfer and every two years thereafter
Nice to have:
Doctorate in Computer Science, Electrical or Computer Engineering, or related field AND 3+ years related experience
5+ years of coding experience in Python and experience with ML frameworks such as PyTorch and Triton
3+ years of experience with large-scale model training for LLMs, SLMs, and agentic models
3+ years of proven ability to design and scale training infrastructure and pipelines in production environments
Experience with agent training frameworks
Demonstrated experience developing synthetic data generation pipelines to enable SFT and RL training of agentic models
Hands-on experience with large-scale distributed training and/or serving with demonstrated ability to dive deep into complex systems, troubleshoot unconventional issues, and craft innovative solutions under real-world constraints
Extensive experience with large-scale training, model inference, reinforcement learning, and reasoning models
Demonstrated ability to work in cross-functional teams and collaborate effectively with researchers, product managers, and other engineers to deliver complex ML solutions
Startup-style mindset: agile, solution-oriented, and self-driven