The Azure High Performance Computing and AI Platform (HPC/AI) group is the team behind Azure’s cloud offering that powers some of the most demanding and largest-scale AI training and inference workloads in the industry. The virtual machine (VM) series that our team owns combine cutting-edge GPUs and accelerators with a state-of-the-art scale-out network infrastructure to enable these workloads. We collaborate with many Microsoft teams and our industry partners to design and bring up the underlying platform, and we build the software to expose this platform as an Azure service. As a Principal Software Engineer in the Azure HPC/AI team, you will play a critical role in delivering the next generations of our platform by solving technical problems at all levels of the stack, contributing to our codebases to enable new features, working on architectural proposals, and collaborating with our internal and industry partners. This position involves deep technical work that primarily focuses on HW/SW interactions, device virtualization, and performance analysis of GPU workloads in VMs. Since our team is also responsible for vertical integration of our services, you will also have the opportunity to work with upper layers of the Azure infrastructure. It is an exciting time for the team as we are working on expanding the capacity and range of supported scenarios to support the next 100X growth.
Job Responsibility:
Willing to dive deeply into any level or layer of a problem.
Willing to learn emerging technologies, from hardware to software. Evaluate and make recommendations that advance Azure infrastructure for AI and other GPU-based workloads.
Leads by example within the team by producing extensible and maintainable code. Optimizes, debugs, refactors, and reuses code to improve performance, maintainability, effectiveness, and return on investment (ROI). Applies metrics to drive the quality and stability of code, as well as appropriate coding patterns and best practices.
Maintains communication with key partners across the Microsoft ecosystem of engineers. Acts as a key contact for leadership to ensure alignment with partners' expectations. Considers partner teams across organizations and their end goals for products in order to deliver desirable user experiences and meet the dynamic needs of partners and customers through product development.
Requirements:
Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
OR equivalent experience.
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.
Nice to have:
Bachelor's Degree in Computer Science or related technical field AND 10+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
OR Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
OR equivalent experience.
Machine Learning & AI Expertise: Familiarity with ML concepts, AI infrastructure, and accelerators; experience with HPC/ML middleware and profiling/performance analysis tools.
Systems & Virtualization: Strong understanding of operating systems fundamentals, virtualization technologies, and distributed systems.
Hardware-Software Co-Design: Experience in co-designing hardware and software for optimized performance.