Do you want to be at the forefront of innovating the latest hardware designs to propel Microsoft’s cloud growth? Are you seeking a unique career opportunity that combines technical capabilities and cross-team collaboration with business insight and strategy? Join the Systems Planning and Architecture (SPARC) team within Microsoft’s Azure Hardware Systems and Infrastructure (AHSI) organization, the team behind Microsoft’s expanding Cloud Infrastructure and responsible for powering Microsoft’s “Intelligent Cloud” mission. We are seeking a passionate Principal AI Network Architect to join the AI systems architecture team. The role includes network architecture evaluation, design, and optimization for next-gen AI systems. Your work will have a direct influence on Azure product roadmaps.
Job Responsibilities:
Leadership: Spearhead architecture definition and evaluation of AI accelerator platforms, with a focus on high-bandwidth, low-latency networks. Drive end-to-end optimization of the stack, from the hardware to the software kernels
Cross-functional collaboration: Partner with silicon and platform design teams to co-design infrastructure that meets performance, reliability, and deployment goals. Frame decisions in terms of TCO, performance, flexibility, and scalability
Prototyping: Work with a state-of-the-art networking lab to prototype new network architectures
Industry influence: Participate in industry consortiums to shape standards and influence vendor roadmaps
Requirements:
Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
Master’s or Doctoral degree in Electrical Engineering, Computer Engineering, or related fields and 10+ years of technical experience in the domain
Deep expertise in Ethernet networking, RDMA (RoCE, InfiniBand), congestion control, and layer 2/3 switching
Experience architecting scale-out/backend networks for AI GPU clusters
Familiarity with scale-up networks such as NVLink and UALink
Experience with high-radix Ethernet switches
Familiarity with AI model execution pipelines, including the ability to analyze communication flows and their impact on model performance
Prior contributions to standards committees and experience with hyperscale network deployments would be an added benefit
Skilled in partnering and influencing architects, hardware engineers, and software leads
Ability to manage through ambiguity, bringing clarity and results orientation to engage and energize collaborators and stakeholders
Collaboration skills, teamwork, and a sense of ownership
Verbal and written communication skills, and the ability to articulate ideas to and engage with both technical and non-technical stakeholders at all levels
Experience leading and driving complex projects with respect and integrity, including those with multiple workstreams spanning different business and technical disciplines
Intellectual curiosity and a passion for learning and deploying new technologies
Problem-solving skills, analytical capabilities, and attention to detail