Azure Resource Manager (ARM) is Azure's control plane: a massive-scale distributed service that enables users to create, update, and delete Azure resources using a uniform set of APIs and tools. ARM supports infrastructure as code through declarative templates, allowing for repeatable, secure, and scalable deployments across environments. Join the ARM team as a Senior Software Engineer – Site Reliability Engineer (SRE) and help build the most reliable public cloud control plane on the planet. In this role, you’ll lead the design and implementation of resilient, scalable systems that ensure the reliability and performance of ARM and Azure Resource Providers. You’ll drive operational excellence and build AI-powered solutions that automate incident response and empower developers to deliver with confidence. You will also guide and mentor less experienced team members.
Job Responsibilities:
Influence and create new designs to improve the availability, scalability, latency, and efficiency of Azure Resource Manager
Continuously improve the observability of the ARM service across areas such as monitoring and alerting, SLOs, and debuggability
Troubleshoot and mitigate complex infrastructure and network issues, and proactively implement measures to reduce the recurrence and impact of future incidents
Participate in regular on-call rotations, and share incident details and resolutions through post-mortem reports and regular review meetings
Mentor and coach less experienced engineers
Embody our culture & values
Requirements:
Master's Degree in Computer Science, Information Technology, or related field AND technical experience in software engineering, network engineering, or systems administration
OR Bachelor's Degree in Computer Science, Information Technology, or related field AND technical experience in software engineering, network engineering, or systems administration
OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements
Microsoft Cloud Background Check
Technical experience working with large-scale cloud or distributed systems
Experience running highly-available, mission-critical large-scale distributed systems
Doctorate Degree in Computer Science, Information Technology, or related field AND technical experience in software engineering, network engineering, or systems administration