Joining the CoreAI organization at Microsoft means becoming part of the team that builds the end-to-end AI stack powering Azure’s innovation. As a member of the FIT training team within CoreAI, you will help develop the AI infrastructure that accelerates the creation of agentic AI systems across Microsoft. This role is dedicated to advancing scientific methods and scalable infrastructure for training agentic models to achieve frontier-level performance. You will contribute to LLMs, SLMs, and agentic models using both proprietary and open-source frameworks, all aimed at delivering reliable, enterprise-grade agentic workflows.
We are seeking a curious, independent, adaptable problem-solver who thrives on continuous learning, embraces changing priorities, and is motivated by creating meaningful impact. Candidates must be able to lead and serve as a role model for a driven team, write efficient code, debug complex training jobs, document findings, and demonstrate a track record of continuous improvement. In addition, we value an agile, startup-style mindset: someone who can iterate quickly, pivot when needed, and collaborate effectively in fast-paced, dynamic environments.
Microsoft’s mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond. In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.
Job Responsibilities:
Engage directly with key partners to understand and implement complex inferencing and agentic capabilities for Microsoft Copilot and other Microsoft products and Azure services
Design and implement an API orchestration layer by leveraging OpenAI models, tools, and capabilities
Work on cutting-edge agentic platforms to automate and solve real-world problems with the latest reasoning AI models
Work with cutting-edge hardware stacks and a fast-moving software stack to deliver best-in-class inference at optimal cost
Anticipate, identify, assess, track, and mitigate project risks and issues in a fast-paced, startup-like environment
Build constructive and effective relationships and solve problems collaboratively
Support production inference SLAs for core AI scenarios on one of the largest GPU fleets in the world
Requirements:
Bachelor's Degree in Computer Science or a related technical field and 6+ years of technical engineering experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, or equivalent experience
Experience in distributed computing and architecture, and/or developing and operating high scale, reliable online services
Nice to have:
Knowledge of and experience with Docker, Kubernetes, CI/CD pipelines, and DevOps for microservices running in Kubernetes clusters
Experience with the Rust programming language
Practical experience working on real-world applications that create or customize AI agents to automate real-world tasks
Experience in developing low latency systems
Experience working in a geo-distributed team
Understanding of parallel algorithms for communication between GPUs, and familiarity with related libraries and frameworks such as DeepSpeed and PyTorch Distributed
Knowledge of LLM architectures (e.g., GPT, Claude, DeepSeek)