We are seeking a Principal Product Manager/Architect to define and guide the technical architecture of Microsoft Foundry as the most reliable, scalable, and efficient AI inferencing platform in the industry. This role sits at the intersection of platform architecture, large-scale GPU fleet management, and strategic customer engagement, with end-to-end accountability for the product direction that shapes reliability, efficiency, and customer trust at global scale. This leader will partner with Engineering and Product Management leaders to drive reliability, efficiency, and strategic customer engagement while remaining deeply engaged in near-term execution. The role partners closely with engineering, product, and customer teams across CoreAI, Azure, and 1P products to ensure Foundry delivers industry-leading reliability, world-class GPU efficiency, and differentiated value for Microsoft’s most strategic AI customers.
Job Responsibility:
1. Product Reliability: Own the product direction for Microsoft Foundry inference, with a primary mandate to make the platform the most reliable enterprise inferencing service available. This includes defining architectural standards for global serving, multi-region resiliency, automated failover, and platform-managed disaster recovery
Drive architectural alignment across global routing, capacity pooling, observability, and control plane abstractions to ensure consistent availability, predictable recovery behavior, and simplified customer operations at scale
Partner with engineering, infrastructure, and security leaders to ensure reliability targets, SLAs, SLOs and recovery objectives are designed into the platform by default
2. GPU Fleet Efficiency & Capacity: Set the product direction for GPU fleet efficiency and capacity management, guiding platform-level design decisions that maximize utilization, minimize fragmentation, and accelerate time-to-monetization of new hardware and models
This includes shaping the architecture for global capacity pooling, intelligent scheduling, fungibility across workloads, automated demand forecasting, and software-defined allocation
The Product Manager/Architect is expected to influence architectural investments across inference utilization, model serving, and hardware/system performance
3. Strategic Customer & Innovation Engagement: Act as a senior technical advisor and architect for Foundry’s most innovative and strategic customers
Engage directly with customers on deep technical challenges, including large-scale model migrations, reliability-sensitive production deployments, and advanced serving architectures
Support competitive and strategic initiatives by articulating Foundry’s architectural advantages, turning bespoke requests into scalable features
4. Cross-Company Technical Leadership: Serve as a unifying architectural voice across product management, engineering, infrastructure, and partner teams
Drive alignment on long-term technical direction, resolve architectural tradeoffs, and provide clear guidance on when to optimize for reliability, efficiency, performance, or speed
Engage with senior Microsoft leadership across 1P teams, producing architectural briefs, decision frameworks, and recommendations
Requirements:
Bachelor's Degree AND 10+ years of experience in product/service/program management or software development, OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements
Microsoft Cloud Background Check
Nice to have:
Proven technical leadership with deep experience designing and operating planet-scale distributed systems, preferably in cloud, AI, or high-performance compute platforms
Proven track record owning end-to-end architecture for mission-critical services with strong availability, resilience, and operational guarantees
Deep understanding of GPU-backed inference systems, capacity management, scheduling, and performance optimization at scale
Demonstrated ability to engage credibly with strategic enterprise customers, solving complex architectural problems and influencing platform direction based on real-world needs
Exceptional communication skills, with the ability to translate complex technical concepts into clear guidance for executives, partners, and customers