As a Network Architect on the Cluster Architecture Team, you will work closely with vendors, internal networking teams, and industry peers to develop a best-in-class interconnect architecture for current and future generations of Cerebras AI clusters. You will be responsible for developing proofs of concept for new network designs and features that enable resilient, reliable networking for AI workloads. The role requires cross-functional collaboration and interaction with diverse hardware components (e.g., network devices and the Wafer-Scale Engine) as well as software at several layers of the stack, from host-side networking to cluster-level coordination. It also requires an understanding of network monitoring systems and network debugging methodologies.
Job Responsibility:
Design AI/ML and HPC clusters
Identify and address performance or efficiency bottlenecks, ensuring high resource utilization, low latency, and high-throughput communication
Drive technical projects that bring together multiple teams and a range of software and hardware components to deliver advanced networking technologies
Bring effective communication skills
Collaborate with vendors and industry peers to drive the network hardware and feature roadmap
Represent Cerebras in industry forums
Serve as the central point of contact for network reliability issues
Requirements:
Ph.D. in Computer Science or Electrical Engineering plus 10 years of industry experience, or Master’s in CS or EE plus 15 years of industry experience
8+ years of experience in large-scale network design for WAN or datacenter environments
Extensive experience debugging networking issues in large distributed-systems environments with multiple networking platforms and protocols
Experience managing and leading multi-phase, multi-team projects
Experience with networking platforms such as Juniper, Arista, Cisco, and open-box architectures (SONiC, FBOSS)
Experience with networking protocols and technologies such as RoCE, BGP, DCQCN, PFC, and streaming telemetry
Familiarity with automation languages such as Python or Go
Familiarity with network visibility and management systems
What we offer:
Build a breakthrough AI platform beyond the constraints of the GPU
Publish and open-source your cutting-edge AI research
Work on one of the fastest AI supercomputers in the world
Enjoy job stability with startup vitality
Work in a simple, non-corporate culture that respects individual beliefs