Be at the forefront of our Microsoft 365 Resilience efforts by leading the development and architecture of our most critical monitoring and alerting. You will identify critical paths in the highest-priority scenarios, including Copilot, and work with service teams to build robust reliability measures, including graceful degradation and failure modes. As a Principal Software Engineer, you will transform and evolve how our critical paths are monitored, measured, and designed for reliability. You'll work directly on the probes, monitoring, and alerting that orchestrate the most critical paths across Microsoft 365, empowering Microsoft's M365 Core Platform and Copilot teams to measure reliability and monitor service health with rigor. This opportunity will allow you to dive deep into Microsoft's M365 Core Platform and technologies and rapidly grow your career.
The M365 Foundation team is a core pillar within Microsoft's M365 Core Platform and Services organization, responsible for ensuring the reliability, resilience, performance, and scalability of the platform that underpins Microsoft 365 services. The team drives strategic investments across AI Evaluations, Performance & Efficiency, Change Management, Reliability & Resilience, Observability & Intelligent Cloud, and Fleet & Capacity, with an emphasis on trust, security, and operational excellence.
Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees, we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Job Responsibility:
Lead product development and scaling to meet customer requirements, apply best practices for meeting scaling needs and performance expectations, and hold accountability for products that do not meet expectations
Partner with stakeholders to determine user requirements within and across teams
Proactively seek new knowledge and adapt to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of products, while driving consistency in monitoring and operations at scale, and share that knowledge with other engineers
Guide the team and lead identification of dependencies and the development of design documents for a product, application, service, or platform
Guide the team to drive multiple group project plans, release plans, and work items in coordination with appropriate stakeholders
Act as an expert and Designated Responsible Individual (DRI), serving on call to mitigate system, product, or service degradation and avoid downtime or interruptions
Embody our culture and values
Requirements:
Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, or Python OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements
Nice to have:
Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, or Python OR Bachelor's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, or Python OR equivalent experience
Analytical mindset with a data-driven approach to problem-solving, consistently upholding high standards of quality and engineering rigor
Collaborative and team-oriented, skilled at articulating complex ideas across disciplines, levels, and product areas to drive alignment and shared success
Experience independently owning and delivering technically challenging projects with measurable impact
Demonstrated ability to quickly master new technologies, tools, and domains