Azure is Microsoft’s central cloud infrastructure that supports public cloud services and many Microsoft-internal cloud-scale systems. Cloud computing is a competitive and rapidly expanding industry, and Azure aims to lead across all key areas of its platform and services. Within Azure, the Azure Compute team provides core infrastructure capabilities for hosting virtual machines, containers, and other workloads.

A foundational discipline in cloud computing is capacity management. Effective capacity management ensures that all regions, allocation domains, and hardware platforms have the resources needed to meet customer demand, while also preventing unnecessary spending and reducing cost of goods sold (COGS) and capital expenditures (CAPEX). At Azure’s scale, balancing these priorities across the entire Azure Compute fleet is highly complex, and improvements can prevent allocation issues while enabling significant cost savings.

The Azure Compute Capacity and Efficiency team, also known as AC2E, is responsible for end-to-end capacity and efficiency management across the fleet. The team builds a fully automated, optimized tracking and management system, with the Capacity Management Automation System (CMAS) as a core component. These systems use advanced algorithms and apply artificial intelligence to predict capacity risks and trigger appropriate mitigation actions within the Azure Compute platform. Team members work across engineering, program management, and data science to define business problems, design solutions, and contribute to strategic decisions that influence Azure Compute’s capacity and efficiency.
Job Responsibility:
Design new tools and processes to enable better data modeling, analysis, and experimentation for capacity across Azure
Understand platform capacity constraints and work with teams across Azure to improve capacity manageability and efficiency
Build models, simulations, scalable and automated analytical systems, and data mining frameworks to derive deep insights into the efficiency and capacity of the Azure Compute platform
Drive improvements to the product design and architecture, leading to increased customer satisfaction
Lead and collaborate with experts from across the company to advance capacity management, capacity planning, and efficiency
Contribute to the team culture and apply best practices in your day-to-day work
Requirements:
BS in Computer Science or equivalent
2+ years of hands-on industry experience in software development, working on cloud infrastructure-related problems with impact on critical product and business decisions
Azure Cloud Services development experience, or related experience
Programming skills (especially in languages used with data technologies, such as Python, Perl, Java, C#, etc.)
Proficiency with relational databases (Kusto, SQL or similar)
Good understanding of a modern, state-of-the-art cloud platform and related technologies
A proven track record of collaborating across organizational boundaries and delivering great results
Comfortable working across the boundary between data science and software engineering
Nice to have:
Master's Degree in Computer Science or related field
1+ years of software development experience or equivalent experience
Experience with globally distributed cloud systems, with a focus on quality and scalability
Experience working across the boundary between data science and software development