The Microsoft AI Asia Platform Team builds the foundational infrastructure powering Microsoft's AI products across Azure, Copilot, Bing, and internal engineering systems. We develop an enterprise-grade Agent Runtime, large-scale model training and serving frameworks, and next-generation AI developer toolchains. Our infrastructure directly impacts how millions of developers build and deploy AI applications globally. We are seeking a software engineer with solid multi-language programming skills to build high-performance, reliable infrastructure for agent systems. You will work on the core runtime for autonomous agents, distributed LLM serving, and RL-based training pipelines, turning cutting-edge AI research into production-grade platform services.
Job Responsibilities:
Agent Runtime Development: Design and implement high-throughput runtime systems supporting Tool Use, Function Calling, and multi-agent orchestration at scale
LLM Serving Infrastructure: Optimize inference stacks including request scheduling, KV cache management, continuous batching, speculative decoding, and model parallelism
AI Developer Tooling: Develop AI-assisted development tools and applications (similar to Claude Code and OpenClaw) to boost engineering and general work productivity
Platform Abstraction: Transform experimental agent capabilities into reusable platform APIs, SDKs, and managed services for upstream product teams
Requirements:
Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Systems engineering fundamentals: Deep understanding of concurrency, memory management, performance optimization, and distributed systems architecture
Agent architecture knowledge: Solid grasp of modern agent paradigms (ReAct, CoT, Function Calling, Agent State Management) with hands-on implementation experience
LLM infrastructure foundation: Understanding of Transformer inference mechanics and experience with serving frameworks (vLLM, TensorRT-LLM, Triton, or custom stacks)
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position requires passing the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Nice to have:
Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
RL Training Experience: Hands-on experience with RLHF, DPO, PPO, or offline RL for language model alignment
High-Performance Deployment: Expertise in model quantization (GPTQ/AWQ/GGUF), compiler optimization (MLIR/TVM), or heterogeneous hardware acceleration (GPU/TPU/NPU)
Agent Systems Depth: Contributions to open-source agent frameworks (LangGraph, AutoGen, OpenHands, CodeR) or deep technical analysis of the Claude Code or OpenClaw implementations
Cloud-Native Engineering: Experience building services on Kubernetes, service mesh architectures, or Azure/GCP/AWS platforms at scale