The AI Systems Architect will work on all aspects of inference and training systems, focusing on system design, data center planning, and modeling of workloads on multiple GPU SKUs. The work will entail reliability modeling, lifecycle modeling and analysis of workloads and GPUs, GPU planning, analytical design of systems, and workload assignment. The successful candidate will focus on deep LLM modeling and on disseminating results across cross-functional orgs to improve understanding of software-hardware codesign features. The candidate must be able to demonstrate deep knowledge of AI systems and architectures for both training and inference SKUs across multi-vendor and multi-generational GPUs and models.
Job Responsibilities:
Partners with appropriate stakeholders to determine user requirements for a set of scenarios
Leads identification of dependencies and the development of design documents for a product, application, service, or platform
Leads by example and mentors others to produce extensible and maintainable code used across products
Leverages subject-matter expertise of cross-product features with appropriate stakeholders (e.g., project managers) to drive multiple groups' project plans, release plans, and work items
Holds accountability as a Designated Responsible Individual (DRI), mentoring engineers across products/solutions, working on-call to monitor system/product/service for degradation, downtime, or interruptions
Proactively seeks new knowledge and adapts to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of products, while driving consistency in monitoring and operations at scale and sharing knowledge with other engineers
Requirements:
Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
OR equivalent experience
Ability to meet Microsoft, customer and/or government security screening requirements
Microsoft Cloud Background Check
Nice to have:
Master's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
OR Bachelor's Degree in Computer Science or related technical field AND 12+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python