Take ownership of Microsoft 365 service operations in sovereign cloud environments where availability and compliance are essential. You'll deploy, monitor, and improve services across the M365 stack, building automation, driving operational excellence, and partnering with engineering teams to ensure service health meets the highest standards. Bring your passion for operations and automation and help us deliver a world-class service experience.
Job Responsibility:
Responds to incidents during regular on-call rotations, including complex incidents with major customer or business impact: identifies the level of impact, troubleshoots, contributes to difficult decisions based on business impact, deploys appropriate fixes to resolve root cause(s), and implements automations to prevent recurring incidents. Coordinates the resources required for incident resolution, which may include product teams, owners, leadership, other engineering teams, and/or subject matter experts
Creates, monitors, and takes action on telemetry data and influences telemetry analytics to better identify patterns that reveal errors and unexpected problems that are affecting the system's availability, reliability, performance, and/or efficiency
Independently implements reliable, scalable, and high-performance solutions across teams
Leverages advanced technical expertise, judgment, and decision making to coordinate multiple work streams and resources in crisis situations, driving a mitigation plan to resolve, reduce, or mitigate the impact of a crisis
Collaborates within and across teams by proactively and systematically sharing information with an appropriate level of detail for their audience
Shares insights and best practices that can be applied to improve development and operations across related sets of systems, services, platforms, and/or products
Monitors and maintains security by addressing security vulnerabilities through patches, reconfigurations, and/or settings updates
Requirements:
Bachelor's Degree in Computer Science, Information Technology, Mechanical Engineering, Electrical Engineering, Aerospace Engineering, Data Science, Cybersecurity, or related field AND 4+ years technical experience in software engineering, network engineering, service engineering, systems engineering, or industrial controls OR equivalent experience
Hands-on experience supporting complex IT environments, with a strong understanding of system and service management challenges
Experience operating in large distributed or air-gapped environments, with a focus on reliability, security, and compliance
Ability to build consensus and influence across teams to achieve common goals
Recent experience with Azure or equivalent hyperscale cloud technologies
Candidates must be able to meet Microsoft, customer, and/or government security screening requirements for this role, including an active U.S. Government Top Secret Clearance with access to Sensitive Compartmented Information (SCI) based on a Single Scope Background Investigation (SSBI) with Polygraph
This position requires verification of U.S. citizenship due to citizenship-based legal restrictions
Nice to have:
Master's Degree in Computer Science, Information Technology, Mechanical Engineering, Electrical Engineering, Aerospace Engineering, Data Science, Cybersecurity, or related field AND 5+ years technical experience in software engineering, network engineering, service engineering, systems engineering, or industrial controls OR Bachelor's Degree in Computer Science, Information Technology, Mechanical Engineering, Electrical Engineering, Aerospace Engineering, Data Science, Cybersecurity, or related field AND 5+ years technical experience in software engineering, network engineering, service engineering, systems engineering, or industrial controls OR equivalent experience
3+ years of technical experience working with large-scale cloud or distributed systems