The Azure High Performance Computing & Artificial Intelligence (HPC & AI) team is responsible for defining, building, and managing AI supercomputers at a massive scale, powering some of the premier models and workloads in the industry. Our mission goes beyond delivering supercomputers; we ensure customers have a world-class experience, empowering them to achieve their best work on infrastructure that “just works.” We are seeking a Senior Product Manager to help deliver and manage some of the largest supercomputers in the industry. This role will partner closely with end customers to understand their needs and drive improvements, proactively identifying pain points, service-impacting issues, and feature gaps. The Senior Product Manager will collaborate with Site Reliability Engineers (SREs) and developers to resolve challenges and introduce new tooling and features, ensuring a continually improving experience for our customers.
Job Responsibilities:
Engages directly with end customers to understand their needs, gather feedback, and proactively identify service-impacting issues and feature gaps
Partners with Site Reliability Engineers and developers to resolve support drivers, introduce new tooling and features, and ensure continuous improvement of the customer experience
Collaborates with partner teams, systems integrators, and other vendors to build golden configurations and deployment strategies with minimal user impact
Collects and reviews service telemetry to identify the most impactful issues affecting service health, uncovers opportunities to enhance the customer experience, and drives iterative improvements for ongoing service quality
Leverages data and AI tools to analyze user behavior or feedback, working with data scientists or engineering to implement AI-driven improvements
Incorporates security factors and implements security-by-default principles in features, services, and experiences
Requirements:
Bachelor's Degree AND 5+ years experience in product/service/program management or software development OR equivalent experience
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter
Nice to have:
Bachelor's Degree AND 8+ years experience in product/service/program management or software development OR equivalent experience
2+ years experience taking a product, feature, or experience to market (e.g., design, addressing product-market fit, launch, or an internal tool/framework)
4+ years experience improving product metrics for a product, feature, or experience in a market (e.g., growing customer base, expanding customer usage, avoiding customer churn)
4+ years experience disrupting a market for a product, feature, or experience (e.g., competitive disruption, taking the place of an established competing product)
2+ years experience managing or leading cross-functional hardware programs involving coordination with multiple stakeholders, tracking KPIs, and enhancing the overall customer experience
3+ years experience partnering with SRE or infrastructure engineering teams to improve service health, including defining and tracking availability, reliability, or “nodes in service” metrics, and driving remediation plans for the top quantified issue drivers
3+ years experience owning and prioritizing a product backlog using operational and customer signals (e.g., incident data, telemetry, customer feedback), translating those inputs into prioritized features, bug fixes, and action plans delivered through regular release cycles
3+ years experience owning the end-to-end user experience for a shared or multi-tenant platform, including managing competing customer requirements, defining success metrics (e.g., CSAT, adoption, availability), and delivering measurable improvements to usability and service quality