Do you want to be at the heart of cloud computing? The Compute team is at the core of Azure and is growing incredibly fast. We build and manage fault-tolerant distributed systems on top of commodity datacenter hardware to deliver an infrastructure for hosting customer applications. The platform provides millions of virtual machines for customers to run their workloads in the cloud. Our team fosters a collaborative environment in which engineers build upon each other’s ideas to deliver world-class customer value at a rapid pace. We empower engineers to deliver creative solutions through bottom-up innovation. This is a fun environment and a great opportunity to work on something highly strategic to Microsoft and extremely relevant in the industry. We’re looking for a Senior Site Reliability Engineer and leader who is passionate about delivering value to customers in mission-critical environments, enjoys a growth hacking culture, and is eager to play a part in one of the most important long games for Microsoft. Microsoft’s mission is to empower every person and every organization on the planet to achieve more.
Job Responsibilities:
Acts as a Designated Responsible Individual (DRI) and guides other engineers by developing and following the playbook, working on call to monitor the system/product/service for degradation, downtime, or interruptions, alerting stakeholders about status, and initiating actions to restore the system/product/service for simple and complex problems when appropriate
Proactively seeks new knowledge and adapts to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of service fabric services while also driving consistency in monitoring and operations at scale
Drives development of design documents for a product, application, service, or platform
Creates, implements, optimizes, debugs, refactors, and reuses code to establish and improve performance and maintainability, effectiveness, and return on investment (ROI)
Leverages subject-matter expertise of product features and partners with appropriate stakeholders to drive a workgroup's project plans, release plans, and work items
Takes full ownership of assigned services, actively contributing to their enhancement across all cloud environments
Identifies opportunities for automation and optimization within the cloud to better support customers
Requirements:
Master's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 4+ years technical experience in software engineering, network engineering, or systems administration OR equivalent experience
Active U.S. Government Top Secret Clearance with access to Sensitive Compartmented Information (SCI) based on a Single Scope Background Investigation (SSBI) with Polygraph
Ability to meet Microsoft, customer and/or government security screening requirements
Must pass Microsoft Cloud background check upon hire/transfer and every two years thereafter
Verification of U.S. citizenship due to citizenship-based legal restrictions
Nice to have:
Doctorate Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration OR Master's Degree in Computer Science, Information Technology, or related field AND 6+ years technical experience in software engineering, network engineering, or systems administration OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 8+ years technical experience in software engineering, network engineering, or systems administration OR equivalent experience
3+ years technical experience working with large-scale cloud or distributed systems
Experience writing scripts and functional programming code to automate tasks, using languages such as Python, JavaScript, or Shell scripting
Experience developing end-to-end technical expertise in the architecture, code, features, and operations of specific products as required to implement improvements in product availability, security, quality, observability, reliability, efficiency, and/or performance
Experience driving code/design reviews with the engineering teams that develop and/or manage those products, and sharing learnings and recommendations across engineering teams working on related products within their organization and other organizations as relevant
Knowledge of distributed systems
Highly effective written and oral communication skills
What we offer:
Certain roles may be eligible for benefits and other compensation