Meta’s Core Infrastructure team seeks a Technical Program Manager (TPM) to lead complex, large-scale projects focused on advancing language model scaling. In this key position, you will collaborate across engineering, hardware, data center, research, and product teams to design, build, and scale foundational hardware, software systems, and tools that support Meta’s AI innovation. You will be responsible for driving the end-to-end integration of new AI hardware with the core infrastructure stack, from initial design validation of our software stack through production deployment. This includes developing and refining repeatable frameworks for efficient onboarding, ensuring robust and predictable execution, and proactively resolving technical and organizational challenges to maintain project momentum. You will use your problem-solving skills, technical acumen, and business insight to streamline onboarding of new AI hardware platforms into Meta’s suite of core infrastructure services. You will communicate transparently across all levels, motivate multidisciplinary teams, and champion best practices to deliver impactful outcomes that advance Meta’s infrastructure.
Job Responsibilities:
Establish and lead effective program teams to ensure alignment and achieve common objectives
Work closely with engineering, data center, hardware, and business stakeholders to define program requirements, prioritize initiatives, and establish scope, including shaping the roadmap and long-term strategy for partner teams
Create and implement communication strategies to proactively share program status, challenges, and risks with stakeholders
Drive successful outcomes by actively managing cross-functional dependencies, mitigating risks, and adjusting scope, timeline, and resources as needed
Collaborate with cross-functional teams to lead the end-to-end lifecycle of programs, including technical analysis, design, development, testing, implementation, and post-launch support
Establish and track key metrics, quality benchmarks, and performance indicators to drive accountability and ensure effective cross-functional execution of program deliverables
Anticipate and evaluate complex, long-term infrastructure challenges in close partnership with engineering leaders and key stakeholders
Drive product strategy to support and align with key company initiatives
Lead process improvements across internal and external teams, streamlining workflows and reducing manual effort through automation
Requirements:
Bachelor of Science in Electrical Engineering, Computer Science, Mechanical Engineering, or a related technical field, or equivalent experience
12+ years of experience in software engineering, hardware engineering, systems engineering, or technical product/program management
Knowledge of software and hardware development for large-scale hardware readiness, including end-to-end product development processes
Ability to communicate complex technical investments clearly, in a simple and understandable manner
Experience delivering complex technology programs and products from inception through to successful delivery
Experience understanding user needs, gathering requirements, and defining project scope
Experience working under your own initiative, across multiple teams, demonstrating critical thinking and providing thought leadership in ambiguous spaces
Experience defining and optimizing engineering processes at scale
Ability to build cross-functional relationships and thrive amid complex challenges
Experience in analytical thinking and problem-solving for large-scale systems
Experience building work relationships across multi-disciplinary teams and with partners in different time zones
Experience defining strategic direction and identifying new opportunities for impact across products, platforms, and programs
Experience communicating at the executive level and influencing leadership and technical management teams to drive the development of systems, solutions, and products
Knowledge of large language models, machine learning, and scaling distributed systems
Demonstrated experience identifying new opportunities for the larger organization and influencing the appropriate stakeholders
Proven commitment to scaling infrastructure for large-scale distributed AI compute systems
Nice to have:
Knowledge of software and hardware development for large-scale system readiness
Ability to communicate complex technical investments clearly, in a simple and understandable manner