Microsoft’s Azure Data engineering team is looking to hire a Site Reliability Engineer. The team is leading the transformation of analytics in the world of data with products like databases, data integration, big data analytics, messaging & real-time analytics, and business intelligence. The products in our portfolio include Microsoft Fabric, Azure SQL DB, Azure Cosmos DB, Azure PostgreSQL, Azure Data Factory, Azure Synapse Analytics, Azure Service Bus, Azure Event Grid, and Power BI. Our mission is to build the data platform for the age of AI, powering a new class of data-first applications and driving a data culture. Within Azure Data, the Microsoft Fabric platform team builds and maintains the operating system and provides customers with a unified data stack to run an entire data estate. The platform provides a unified experience and unified governance, and enables a unified business model and a unified architecture. The Site Reliability Engineering (SRE) team ensures the reliability, scalability, and performance of systems and services. By integrating software engineering with IT operations, the team automates processes, manages incidents, and enhances system resilience. Acting as a bridge between development and operations, SREs help organizations maintain highly reliable and efficient systems while enabling fast and seamless software delivery.
Job Responsibility:
Work with all aspects of a high throughput and multi-tenant service
Collaborate effectively within the team and with partner teams across Microsoft
Be part of the on-call rotation for maintaining service health
Design, implement, and refine chosen solutions in close partnership with Product Management and partner teams
Champion operational excellence via established metrics, process governance, and policy controls for regular assessment and improvement
Document and define existing data engineering processes, data and technology, while evaluating them for optimization
System Reliability & Uptime – Ensuring high availability of services
Incident Management – Detecting, responding to, and mitigating system failures
Performance Monitoring – Tracking system health and resolving bottlenecks
Automation & Tooling – Reducing manual work through scripts and automation
Capacity Planning – Scaling infrastructure efficiently to handle demand
Postmortems & Continuous Improvement – Analyzing failures to prevent recurrence
Embody our culture and values
Requirements:
Master's Degree in Computer Science, Information Technology, or related field AND 1+ year(s) technical experience in software engineering, network engineering, or systems administration OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration OR equivalent experience
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Nice to have:
5+ years technical experience in software engineering, network engineering, or systems administration OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration OR Master's Degree in Computer Science, Information Technology, or related field AND 1+ year(s) technical experience in software engineering, network engineering, or systems administration
4+ years technical experience in software engineering, network engineering, or systems administration OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration OR Master's Degree in Computer Science, Information Technology, or related field
2+ years’ experience with scripting languages such as PowerShell or Python
Experience writing code to automate day-to-day tasks