Microsoft is a highly innovative company that collaborates across disciplines to produce cutting-edge cloud technology that changes the world. The Cloud Server Infrastructure team within Microsoft Azure builds and operates the hardware and software foundation powering Microsoft’s global cloud services. The platform spans massive scale across global datacenters, requiring continuous innovation in reliability, efficiency, performance, automation, and serviceability. The team partners across hardware, firmware, software, and operations to improve cloud infrastructure availability and enable next-generation systems. Microsoft also contributes to industry efforts such as Project Olympus and the Open Compute Project to accelerate open hardware innovation.

We are looking for a highly motivated Software Development Engineer to build and operate large-scale cloud infrastructure systems that power Azure. In this role, you will design and develop software that manages and monitors cloud hardware across hyperscale environments, driving improvements in reliability, availability, and performance. You will work across the full stack, from low-level Linux-based device software to cloud-scale service orchestration, enabling intelligent decision-making on hardware health signals and improving overall cloud hardware availability and performance.

This role provides an opportunity to influence next-generation datacenter architecture and contribute to industry ecosystems such as the Open Compute Project, aligning with Microsoft’s open hardware innovation strategy.
Job Responsibilities
Design, develop, and maintain a Linux-based service and device management stack using C, C++, Python, and other systems programming languages
Build and optimize distributed systems and cloud services for monitoring and managing hardware at hyperscale
Implement hardware interface programming (SPI, I2C, GPIO, UART) and support board bring-up, firmware, and Linux boot flows including U-Boot and kernel integration
Develop and enhance device telemetry, health monitoring, hardware health signal processing, and automated remediation workflows
Drive live-site excellence through monitoring, debugging, root cause analysis, repair loops, and continuous service reliability improvements
Collaborate with hardware, firmware, platform, and partner teams to deliver end-to-end solutions across hardware-software boundaries
Translate customer and production feedback into feature enhancements, bug fixes, reliability improvements, and supportability investments
Leverage and contribute to open-source ecosystems such as OCP and Linux where appropriate, bringing relevant best practices into Microsoft platforms
Demonstrate end-to-end ownership for components and features, including design, implementation, validation, deployment, and production support
Apply strong debugging and problem-solving skills in complex distributed, embedded, and hardware-software integrated environments
Deliver secure, maintainable, and high-quality code with clear design documentation, unit/integration coverage, and operational readiness
Participate actively in code reviews, design reviews, technical decision-making, and cross-team alignment
Balance feature delivery with reliability, scalability, observability, performance, and long-term supportability
Communicate clearly with peers, stakeholders, partner teams, and customers
Convert ambiguity into actionable engineering plans
Requirements
Bachelor's Degree in Computer Science or a related technical field AND 2+ years of technical engineering experience coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python, OR equivalent experience
Ability to meet Microsoft, customer, and/or government security screening requirements
Microsoft Cloud Background Check
Nice to have
Ph.D. in Computer Science, Computer Engineering, Electrical Engineering, or related field
OR M.S. with 4+ years of industry development experience
OR B.S. with 8+ years of industry development experience
Strong proficiency in C/C++ systems programming and software design fundamentals
Experience with Python and scripting languages for automation, diagnostics, and tooling
Solid understanding of operating systems concepts, Linux/Unix environments, and system-level debugging
Experience with multi-threaded, concurrent, and user-mode programming
Knowledge of user-kernel interactions, system interfaces, and hardware-software integration concepts
Working knowledge of one or more of the following is desirable: C#, .NET, or other object-oriented languages
Experience with Azure services and a database query language such as KQL (Kusto) is desirable
Strong problem-solving and debugging skills in production or complex integration environments
Effective written and verbal communication skills with the ability to explain technical findings clearly
Experience with Linux kernel development, device drivers, or low-level system components
Exposure to firmware development, BMC, Rack Manager, platform management, or embedded systems
Familiarity with distributed systems, microservices, cloud services, or infrastructure automation
Experience with telemetry pipelines, observability tools, monitoring, alerting, and live-site diagnostics
Knowledge of datacenter hardware architecture, hardware health management, and operational workflows
Understanding of security fundamentals such as secure boot, authentication, authorization, certificate handling, and secure update flows
Experience contributing to open-source projects or engaging with industry communities such as Linux and OCP