Infrastructure Hardware Technical Program Manager (Server And Network Systems)

United States; Canada, Sunnyvale · Job Posted May 29, 2026

Job Description

As an Infrastructure Hardware Technical Program Manager (Server and Network Systems) on the Cluster Architecture Team, you will drive end-to-end delivery of server and network platform programs across Cerebras CS-3–based AI clusters, from requirements and vendor selection through lab bring-up, qualification, and production rollout. You will be the execution owner for multi-team programs spanning OEM/ODM partners, component vendors, internal software/runtime teams and architects, validation/QA, and deployment/operations. This role is intentionally technical: you must understand server, network, and system-level trade-offs well enough to run effective technical reviews, keep programs grounded in real constraints, and maintain a crisp decision trail, while partnering closely with the Compute / Server / Network Platform Architects for detailed technical direction and sign-off. You will also build shared understanding with our rack/elevations and physical datacenter design partners so that server and network changes land smoothly in real deployments (without owning physical DC design).

Job Responsibility

  • Own end-to-end program execution for server systems and network equipment in Cerebras clusters, including new platforms, refreshes, and major component/config changes
  • Drive requirements gathering and convert inputs into executable plans with clear milestones, readiness gates, and cross-functional deliverables
  • Represent Cluster Architecture in executive reviews, OKR cycles, and leadership/customer forums as needed
  • Build and manage integrated schedules across vendors and internal teams, track dependencies, critical path, and risks
  • Manage OEM/ODM and switch/vendor engagements (RFI/RFP, samples, escalations, roadmap alignment)
  • Partner with Compute / Server Platform / Network Architects to turn architectural decisions into qualification plans, acceptance criteria, and rollout strategies
  • Lead qualification and release readiness (lab/staging validation, regression tracking, go/no-go decisions)
  • Own risk and change management into production, including versioning, rollout sequencing, and stakeholder communication
  • Ensure operational readiness with deployment and fleet teams and maintain alignment with rack/physical DC owners on power, cooling, space, and cabling constraints

Requirements

  • B.S. or M.S. in Computer Science, Electrical/Computer Engineering, or equivalent experience
  • 8+ years in Technical Program Management (or similar delivery leadership) for server, network, or infrastructure platforms from concept through production
  • Experience coordinating complex server and/or datacenter network programs across OEM/ODMs, switch vendors, and internal engineering teams
  • Working knowledge of server architecture (CPU/NUMA, memory bandwidth, PCIe, NIC and storage IO) and enough networking fundamentals (leaf-spine fabrics, switch platforms, high-performance interconnects) to run effective technical reviews
  • Familiarity with Linux server fleet management (provisioning, firmware/BIOS, drivers, field triage)
  • Strong multi-team program execution skills: integrated plans, risk management, dependency tracking, and executive-level communication
  • Ability to operate in ambiguity and keep parallel server and network workstreams aligned

Nice to have

  • Experience with AI/ML, HPC, or performance-sensitive distributed infrastructure

What we offer

  • Build a breakthrough AI platform beyond the constraints of the GPU
  • Publish and open source your cutting-edge AI research
  • Work on one of the fastest AI supercomputers in the world
  • Enjoy job stability with startup vitality
  • Enjoy a simple, non-corporate work culture that respects individual beliefs

Similar Jobs for Infrastructure Hardware Technical Program Manager (Server And Network Systems)

8 matching positions

Technical Program Manager- AI Cluster Validation

We are seeking a Technical Program Manager to lead execution of AI cluster engin...
Location
United States, Austin
Salary
162640.00 - 243960.00 USD / Year
AMD
Expiration Date
Until further notice
Requirements
  • Experience leading complex hardware or AI infrastructure programs with ownership across bring-up, validation, and deployment phases
  • Strong technical understanding of GPU-based AI systems, rack architectures, and datacenter infrastructure
  • Proven ability to manage ambiguity, drive debug execution, and lead cross-functional teams without direct authority
  • Strong written and verbal communication skills, including executive-level status reporting
  • Proficiency with program management and execution tools (Jira, Confluence, dashboards, Excel/PowerPoint)
  • Bachelor's or master's degree in systems, EE, CS, or related engineering discipline
  • PMP, Scrum Master, or equivalent program management training
Job Responsibility
  • Define, plan, and drive program plans for AI infrastructure systems validation and readiness, including server integration, rack bring-up, and cluster-scale deployment readiness
  • Create and maintain core PM artifacts: schedules, dependency maps, resource forecasts, risk/issue logs, and program dashboards/status reports
  • Identify and drive mitigation plans for issues/risks, including cross-team escalations and corrective actions across multiple engineering areas
  • Drive regular execution reviews with engineering teams and provide concise, data-driven updates to senior leadership
  • Own program execution for GPU-based AI platforms, spanning system bring-up, qualification, scale readiness, and deployment validation across server, rack, and cluster levels
  • Drive alignment across GPU, CPU, firmware, BIOS/BMC, and system teams to ensure readiness for scale testing and customer workloads
  • Track platform issues and debug dependencies; ensure risks are clearly documented, owned, and mitigated
  • Own program planning and execution for multi-node and multi-rack scale testing, including test strategy, scheduling, coverage tracking, and readiness gates
  • Lead end-to-end delivery of rack-level AI solutions, including compute trays, switch trays, cabling, power, cooling, and management infrastructure
  • Fulltime

Datacenter Program Manager New Product Integration

As a Datacenter Program Manager New Product Integration, you will lead complex i...
Location
United States, West Des Moines
Salary
102600.00 - 202800.00 USD / Year
Microsoft Corporation
Expiration Date
Until further notice
Requirements
  • High School Qualification or equivalent AND 3+ years of experience supporting IT equipment or related technology or delivering server and network deployment projects in large-scale environments, OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Job Responsibility
  • Drive the integration of new hardware and complex systems into mission‑critical datacenter environments, from requirements assessment through execution and operational readiness
  • Assess existing datacenter infrastructure and component dependencies to determine integration requirements for new technology deployments
  • Define integration strategies, methods, sequencing, and readiness criteria aligned with DCO deployment and change governance expectations
  • Leverage existing infrastructure, platforms, and standard solutions to reduce cost, minimize operational risk, and improve delivery efficiency
  • Own integration planning artifacts, including dependency tracking, execution plans, and milestone alignment
  • Partner with engineering, deployment, operations, and vendor teams to troubleshoot and resolve issues encountered during new integrations
  • Validate that completed integrations meet defined technical, operational, safety, and business success criteria prior to operational handoff
  • Identify integration risks and constraints early and drive mitigation plans across stakeholders
  • Enable operational readiness by educating DCO teams and partners on system and hardware integration procedures
  • Create and maintain high‑quality technical and program documentation to support execution, auditability, and long‑term sustainment.
  • Fulltime

NPI Program Manager

Meta’s Infrastructure organization is seeking an experienced New Product Introdu...
Location
United States, Fremont
Salary
140000.00 - 198000.00 USD / Year
Meta
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in a directly related field, or equivalent practical experience
  • 6+ years of relevant work experience in computer hardware and/or networking manufacturing, shop floor control, supply chain execution, manufacturing testing and product lifecycle management tools and processes
  • Experience in NPI product life-cycle, program management including program kickoff, stakeholder influence & strategy, workstream prioritization, driving cross-functional execution and establishing working relationships across multi-disciplinary teams
  • Technical background in engineering or manufacturing in AI, compute, or networking space
  • Demonstrated ability to adapt, take initiative, and thrive in ambiguity
  • Knowledge of manufacturing-specific roles & responsibilities throughout NPI phases including HW, SW, manufacturing, product, process, test & quality engineering, reliability, and quality assurance
  • Knowledge of supply chain-specific roles & responsibilities throughout NPI phases including procurement, planning, product lifecycle management, materials management, demand and supply planning, and logistics
  • Knowledge of manufacturing test processes & systems, including quality inspections, automated server, storage and integrated rack testing
  • Knowledge of PLM concepts including BOMs, ECO / MCO, deviations, part substitutions, effectivity date management, quality management systems
  • Proven communication and presentation skills
Job Responsibility
  • As single-threaded owner for Meta’s Supply Chain organization, apply hands-on program management skills and techniques to drive overall schedule, build & test readiness, factory readiness, material readiness, and risk management, internal and external escalations and program communications to achieve on-time and on-quality execution of NPI builds for AI compute, storage and network hardware programs
  • Act as primary liaison between Meta’s Supply Chain organization and all internal and external partners for covered programs
  • Drive build readiness including all Supply Chain deliverables and daily build issue debug cadence across Meta’s Supply Chain and internal and external engineering and manufacturing partners
  • Drive manufacturing, test and material readiness, infrastructure
  • Own program risk identification, mitigation, management and communication across Meta’s Supply Chain organization
  • Own overall program communication strategy and execution across Meta’s supply chain organization and internal and external partners
  • Drive factory ramp readiness with internal and external engineering and manufacturing partners
  • Own and drive program escalations across Meta internal and external teams and partners
  • Identify operational pain points and improvement opportunities across internal and external processes and workflows and drive improvements across multiple teams and functions
What we offer
  • bonus
  • equity
  • benefits

NPI Program Manager

As Meta’s datacenter infrastructure grows rapidly in scope and scale the increas...
Location
United States, Fremont
Salary
168000.00 - 234000.00 USD / Year
Meta
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in a directly related field, or equivalent practical experience
  • 8+ years of relevant work experience in computer hardware and/or networking manufacturing, shop floor control, supply chain execution, manufacturing testing and product lifecycle management tools and processes
  • Experience in NPI product life-cycle, program management including program kickoff, stakeholder influence & strategy, workstream prioritization, driving cross-functional execution and establishing working relationships across multi-disciplinary teams
  • Technical background in engineering or manufacturing in AI, compute, or networking space
  • Demonstrated ability to adapt, take initiative, and thrive in ambiguity
  • Knowledge of manufacturing-specific roles & responsibilities throughout NPI phases including HW, SW, manufacturing, product, process, test & quality engineering, reliability, and quality assurance
  • Knowledge of supply chain-specific roles & responsibilities throughout NPI phases including procurement, planning, product lifecycle management, materials management, demand and supply planning, and logistics
  • Knowledge of manufacturing test processes & systems, including quality inspections, automated server, storage and integrated rack testing
  • Knowledge of PLM concepts including BOMs, ECO / MCO, deviations, part substitutions, effectivity date management, quality management systems
  • Proven communication and presentation skills
Job Responsibility
  • As single-threaded owner for Meta’s Supply Chain organization, apply hands-on program management skills and techniques to drive overall schedule, build & test readiness, factory readiness, material readiness, and risk management, internal and external escalations and program communications to achieve on-time and on-quality execution of NPI builds for AI compute, storage and network hardware programs
  • Act as primary liaison between Meta’s Supply Chain organization and all internal and external partners for covered programs
  • Drive build readiness including all Supply Chain deliverables and daily build issue debug cadence across Meta’s Supply Chain and internal and external engineering and manufacturing partners
  • Drive manufacturing, test and material readiness, infrastructure
  • Own program risk identification, mitigation, management and communication across Meta’s Supply Chain organization
  • Own overall program communication strategy and execution across Meta’s supply chain organization and internal and external partners
  • Drive factory ramp readiness with internal and external engineering and manufacturing partners
  • Own and drive program escalations across Meta internal and external teams and partners
  • Identify operational pain points and improvement opportunities across internal and external processes and workflows and drive improvements across multiple teams and functions
What we offer
  • bonus
  • equity
  • benefits

Infrastructure Project Manager

We are seeking an experienced Infrastructure Project Manager to lead and deliver...
Location
United States, Hollywood
Salary
Not provided
Robert Half
Expiration Date
Until further notice
Requirements
  • Experience with cloud migrations, data center refreshes, network upgrades, security initiatives, and M365/identity modernization
  • Cloud Platforms: Azure, AWS, hybrid cloud
  • Enterprise Infrastructure: Windows Server, Active Directory, VMware, Linux
  • Networking: WAN, LAN, SD-WAN, VPNs, firewalls
  • Collaboration & Identity: Microsoft 365, Exchange, Entra ID (Azure AD), SSO, MFA
Job Responsibility
  • Lead end-to-end delivery of IT infrastructure projects, including cloud and data center migrations; network, WAN, and SD-WAN deployments; server, storage, and virtualization upgrades; Microsoft 365, Exchange, and identity modernization; and disaster recovery and business continuity programs
  • Develop and manage project plans, schedules, budgets, and resource allocations; cutover, go-live, rollback, and risk mitigation plans; and dependency tracking across applications, networks, and security teams
  • Coordinate cross-functional technical teams, including network engineers, systems and cloud engineers, security and compliance teams, application owners, and business units
  • Manage third-party vendors and MSPs, including cloud providers, systems integrators, ISPs, and hardware vendors
  • Ensure projects align with security, compliance, and governance standards, including SOC, DR, change management, and audit requirements
  • Provide regular executive-level status reporting, risk tracking, and milestone updates
  • Drive continuous improvement in project delivery, documentation, and operational readiness
What we offer
  • medical
  • vision
  • dental
  • life and disability insurance
  • 401(k) plan
  • Fulltime

Technical Account Manager

As a Technical Account Manager at Axon, you will be the primary point of contact...
Location
Australia, Melbourne
Salary
Not provided
Axon
Expiration Date
Until further notice
Requirements
  • Australian Citizenship (required for working with sensitive government data; must pass security clearance)
  • 5+ years of IT experience in a support or deployment role
  • Experience working with law enforcement and/or government entities
  • Proven track record of managing customer relationships and technical projects successfully
  • Ability to work autonomously to meet objectives with minimal oversight
  • Robust IT background, with expertise in: Software image creation and maintenance
  • Routing, switching methodologies, Wi-Fi, telecommunications, and Internet technologies
  • Microsoft Server & Client operating systems, Microsoft SQL Server, Active Directory, Azure (Entra ID)
  • Network Administration (TCP/IP, DHCP, DNS, SSH, Firewalls)
Job Responsibility
  • Achieve expertise in Axon technologies, including Axon Evidence, body cameras, and Fleet system
  • Serve as the primary technical liaison between Axon and the customers
  • Participate in operational and technical meetings, ensuring effective communication and collaboration
  • Build and maintain an internal Axon network to support both the customer-facing Axon team and the customer Project Team
  • Provide field support, including setup and configuration of Axon hardware such as TASER and BWC docking stations
  • Assist the customer in inventory management of Axon devices
  • Monitor support tickets, provide technical troubleshooting (tier 2 level support) and escalate when necessary
  • Ensure Service Level Requirements (SLRs) and contractual obligations are met
  • Assist the Program Manager, customer, and/or professional services team by providing requested materials, information, and Voice of Customer (VOC) documentation
  • Communicate customer feedback across Axon teams and collaborate to drive product improvements
  • Fulltime

Senior Software Engineer - SRE

Hybrid: This role is categorized as hybrid and is expected to report to Austin ...
Location
United States, Austin; Warren
Salary
Not provided
General Motors
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in computer science or a related field, or equivalent work experience
  • 7-10 years of software experience with strong proficiency in PostgreSQL and at least one other database technology (Oracle, SQL Server)
  • Proficiency in at least one programming language (e.g., Python, Go, Java) and familiarity with multiple language ecosystems
  • Solid understanding of operating systems, networking, distributed systems, databases, and storage architectures
  • Deep understanding of how code runs on underlying hardware, including operating systems, algorithms, and data structures
  • Ability to optimize or troubleshoot code by understanding its execution and the impact on system resources
  • Experience handling production incidents, including root cause analysis, mitigation, and working through complex system failures
  • Strong communication skills, with an ability to explain technical concepts to both engineering and business stakeholders
  • Commitment to collaborative problem-solving and shared ownership of services
  • Proven experience in automating manual processes, building deployment pipelines, or managing configuration systems
Job Responsibility
  • Develop tools and software to automate operational processes, improve system reliability, and reduce manual intervention
  • Lead, implement, and improve monitoring and observability frameworks, enabling proactive detection and resolution of incidents
  • Participate in an on-call rotation to diagnose, troubleshoot, and mitigate production incidents, ensuring minimal downtime and swift resolution
  • Work alongside developers to ensure the quality, scalability, and reliability of our database services
  • Practice shared ownership of services in production, fostering a "You build it, you run it" culture
  • Manage Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs) to manage reliability expectations effectively
  • Conduct deep-dive analyses of incidents and collaborate on post-incident reviews to derive learnings and prevent recurrence
  • Champion a culture of continuous improvement
  • Evaluate system performance and advocate for optimizations that reduce infrastructure costs while maintaining service reliability
  • Fulltime

Information Systems Security Officer 2 (Forecasted)

The Information System Security Officer (ISSO) supports the cybersecurity and in...
Location
USA, Annapolis Junction
Salary
142000.00 - 240000.00 USD / Year
Columbia Technology Partners
Expiration Date
Until further notice
Requirements
  • Ten (10) years of relevant experience
  • Experience must include at least two (2) of the following areas: current security tools, hardware/software security implementation, communication protocols, encryption techniques/tools
  • Bachelor's degree in Computer Science or a related discipline from an accredited institution, or four (4) additional years of experience in lieu of a degree
  • Must meet DoD 8570 Information Assurance Management (IAM) Level I or higher compliance
  • Understanding of information assurance principles, NIST RMF processes, and security authorization practices
  • Ability to manage security posture, perform risk assessments, and support system authorization activities
  • Strong communication and coordination skills for working with technical teams and leadership
Job Responsibility
  • Support senior ISSOs in implementing and enforcing information security policies, procedures, and methodologies
  • Assist in preparing, reviewing, and maintaining cybersecurity documentation, including System Security Plans (SSPs), Risk Assessment Reports, Certification & Accreditation (C&A) packages, and System Requirements Traceability Matrices (SRTMs)
  • Evaluate security solutions to ensure they meet requirements for processing classified information
  • Support and conduct vulnerability and risk assessment activities in alignment with security authorization requirements
  • Provide Configuration Management (CM) for security‑relevant software, hardware, and firmware, including assessing the security impact of system changes
  • Maintain detailed records of network devices and system components such as workstations, servers, routers, firewalls, switches, and related infrastructure
  • Develop and maintain cybersecurity authorization documentation in accordance with ODNI, DoD, and NIST RMF guidance
  • Ensure compliance with system security policies and maintain the overall cybersecurity posture of assigned systems
  • Support the ISSM with operational cybersecurity responsibilities for systems, programs, or enclaves
  • Update security plans and other required cybersecurity documentation
What we offer
  • Medical: CTP offers 3 superior plans, bringing our employees both in-network and out-of-network options
  • Vision + Dental: Both free to you + paid in full by CTP
  • Retirement: 401k - 6% company contribution
  • PTO + Leave: A work life balance is extremely important to our team here at CTP, which is why our paid time off plans are so lucrative. Offering customizable leave plans to meet your needs is just one of our many perks! Jury Duty, Bereavement + Military Leave provided
  • Career Growth: Up to $10,000 provided for approved career-related learning, training, education, and/or tuition
  • Life and AD&D Insurance/Short-Term & Long-Term Disability: More peace of mind, at zero cost to you
  • Profit Sharing Bonus: End of year cash gets added to your bottom-line
  • Referral Bonus Program: Our tiered program provides an incentive with each stage of the hiring process your referral passes. Our bonuses range from $7,000-$20,000, if your referral joins the team
  • Fulltime