We are seeking a highly skilled systems engineer to architect and design scalable AI/HPC clusters, with a specific focus on rack and data center power delivery. The role involves evaluating and selecting compute, storage, networking, and power delivery components and solutions to optimize performance and reliability across global deployments. You will collaborate with cross-functional teams to deliver cutting-edge infrastructure for AI and high-performance computing workloads.
Job Responsibilities:
Design scalable AI/HPC clusters, including compute, storage, and networking, with a specific focus on power delivery
Evaluate and select CPUs, GPUs, accelerators, interconnects, and memory configurations for optimal cluster performance
Design leading-edge power delivery solutions for high-density AI/GPU deployments
Define power budgets, redundancy schemes, and fault tolerance mechanisms
Design network topologies to maximize overall cluster performance
Understand the network performance needs of different types of workloads
Understand the advantages and performance trade-offs of network topologies for AI/HPC clusters
Design and optimize storage solutions to maximize AI/HPC cluster performance
Understand the advantages and performance trade-offs of cluster storage solutions such as Lustre and Ceph
Work across multiple organizations with subject matter experts from hardware, software, network, data center, and operations teams to deliver scalable, efficient, and reliable compute infrastructure
Requirements:
Experience in HPC, AI infrastructure, or data center systems engineering
Strong understanding of rack and data center power delivery
Knowledge of GPU/CPU architectures, PCIe, UALink, InfiniBand, and Ethernet networking
Familiarity with AI/ML frameworks and workload characteristics
Excellent problem-solving, communication, and documentation skills
Bachelor's or Master's degree in Electrical Engineering, Computer Engineering, Computer Science, or a related field
Nice to have:
Experience designing power delivery solutions for racks and data centers
Contributions to open-source HPC or AI infrastructure projects