Are you looking for an opportunity to shape the future of Artificial Intelligence (AI) infrastructure while building software and systems for some of the largest data centers ever created? Join the Azure Hyperscale Network organization, which is responsible for designing and building the software-defined physical network infrastructure that powers millions of servers globally for Azure, Bing, Microsoft cloud services, and AI Supercomputing. We’re at the forefront of large-scale cloud computing, managing one of the world’s largest datacenter network infrastructures. This isn’t just another engineering job: it’s your chance to join an innovative team working on cutting-edge AI infrastructure projects, building intelligent agents and scalable systems at a scale the industry has never seen.

As a Senior Software Engineer on the Network Device Health team, you’ll design, build, deploy, and maintain large-scale distributed systems that collect network telemetry, verify network states, detect and alert on reliability issues, and rapidly localize and mitigate them. You’ll tackle complex technical challenges, from massive data pipelines to autonomous AI systems, that push your skills to new heights.

We offer opportunities for innovation and career growth, with mentorship, real ownership, and endless learning. Whether you’re early in your career or you have several years of experience, your ideas will be valued and your impact immediate. If you’re excited to solve hard problems and push boundaries, come join us on this visionary journey.
Job Responsibilities:
Collaborate with cross-functional stakeholders to define user requirements and translate them into intelligent, agentic software applications that autonomously reason, plan, and act across complex workflows
Design and implement scalable, production-grade AI systems that integrate generative AI capabilities—such as large language models (LLMs) and multimodal systems—into real-world applications to enhance user experiences and automate tasks
Drive the identification of technical dependencies and author design documents for services and platforms that support AI-driven application development
Create, optimize, debug, refactor, and reuse code to improve system performance, maintainability, and return on investment (ROI)
Act as a Designated Responsible Individual (DRI), guiding incident response and on-call operations to monitor, triage, and restore services during degradation or outages—ensuring high availability and reliability
Work closely with other software engineers and data scientists to support Azure’s production network and integrate AI-driven insights into infrastructure operations
Proactively explore emerging technologies, patterns, and tools to improve observability, efficiency, and performance at scale—driving consistency in monitoring and operational excellence across the engineering lifecycle
Requirements:
Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
2+ years of experience with distributed systems or networking
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter
Nice to have:
Bachelor's Degree in Computer Science OR related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, OR Python OR Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
1+ years of experience with generative AI models (e.g., LLMs, diffusion models) and frameworks such as OpenAI, Hugging Face Transformers, LangChain, or Semantic Kernel
1+ years of experience with agentic AI architectures, including planning, memory, tool use, and orchestration