AMD’s Global Cluster Engineering (GCE) team designs, validates, and deploys large-scale AI compute clusters that power some of the world’s most demanding workloads. We are hiring a Director of Technical Program Management to lead a portfolio of complex engineering, validation, and next-generation platform development programs spanning cluster architecture, deployment, lifecycle operations, and supply chain execution.
Job Responsibilities:
Own a multi-year program portfolio for global cluster initiatives (new cluster builds, cluster validation and operational excellence), including critical milestones, dependencies, risk management, and executive reporting
Establish program governance (operating rhythms, QBRs, escalation paths, decision logs) across engineering, operations, finance, procurement, and suppliers
Lead end-to-end supply chain planning and execution for cluster infrastructure: server/GPU platforms, networking, storage, racks, power/cooling, spares, and long-lead components
Drive build readiness and NPI-style execution: BOM maturity, lead-time management, contract manufacturer alignment, and deployment sequencing
Partner with sourcing/procurement to optimize cost, availability, and resiliency across suppliers, balancing time-to-deploy with design and qualification constraints
Build and scale supply chain product automation for cluster delivery: forecasting, allocation, inventory visibility, exception management, and ETA/lead-time prediction
Own “product-like” delivery of internal platforms and tools (dashboards, APIs, workflow automation, digital-twin planning models) that improve supply chain decisions and reduce manual overhead
Define KPIs and data products for planning accuracy, schedule predictability, cost-to-serve, inventory health, and deployment velocity
Translate business and engineering objectives into executable program plans, including infrastructure requirements, capacity models, and deployment playbooks
Drive technical and operational trade-offs across performance, reliability, cost, availability, and schedule; communicate clearly to executives and stakeholders
Coordinate with datacenter/colo partners and internal facilities to ensure readiness for power, cooling, networking, security, and compliance
Establish lifecycle processes for cluster hardware: commissioning, firmware/software baselines, break/fix, RMA workflows, spares strategy, and refresh planning
Improve reliability and serviceability via standardization, automation, and closed-loop feedback (telemetry → root cause → supplier/engineering fixes)
Ensure operational rigor around change management, incident reviews, capacity planning, and continuous improvement
Requirements:
12+ years of experience in technical program management, engineering program leadership, infrastructure delivery, or adjacent roles
Proven track record delivering large-scale infrastructure programs (datacenter, cloud, AI/HPC clusters, platforms, or complex hardware/software systems)
Demonstrated experience partnering with supply chain organizations (procurement, sourcing, planning, manufacturing, logistics) and managing long-lead constraints and supplier dependencies
Strong program fundamentals: scope definition, critical path, integrated schedules, RAID management, executive communications, and stakeholder alignment in a matrix environment
Comfort with technical depth across compute platforms, networking/storage concepts, and operational tooling—enough to drive decisions and resolve ambiguity
Undergraduate degree is preferred
Applied Science Degree, PMP, and/or MBA are desired
Nice to have:
Experience building infrastructure automation systems (planning workflows, exception management, supply chain automation, inventory/ETA visibility, forecasting, allocation logic, or digital twin models)
Experience with supply chain product automation: owning internal tooling as a product (roadmap, requirements, adoption, metrics, iteration)
Familiarity with contract manufacturing, ODM/OEM ecosystems, qualification flows, and NPI for infrastructure platforms
Strong analytics orientation (SQL/BI tools, data pipelines, KPI design) and ability to translate data into decisions
Experience operating global programs spanning multiple regions, datacenters, and suppliers