Senior Technical Program Manager – AI Infrastructure, Site Operations

Cerebras Systems

Location:
United States, Sunnyvale

Contract Type:
Not provided

Salary:
Not provided

Job Description:

Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.

This Sr. TPM role owns site and data center operations programs supporting Cerebras’ AI Cloud and customer deployments. The position sits at Sunnyvale HQ and works closely with Hardware Engineering, Inference Engineering, and Operations leadership to ensure Cerebras systems are reliably deployed, operated, and scaled. This is a highly technical, execution-focused TPM role with strong emphasis on operational readiness, cross-functional coordination, and metrics/KPIs.

Job Responsibility:

  • Own end-to-end technical programs for data center and site operations
  • Act as single-threaded owner across:
      • Hardware & Systems Engineering
      • AI Cloud Infrastructure & Operations
      • Network & Storage Engineering
      • Facilities, power, cooling, and colo partners
  • Drive site readiness for Cerebras Wafer-Scale Engine systems
  • Partner on installation, commissioning, change management, and break/fix workflows
  • Lead incident reviews and postmortems; ensure corrective actions are closed
  • Define and own operational metrics and KPIs, including:
      • Availability and reliability
      • Incident rate, severity, MTTR / MTTD
      • Deployment readiness and time-to-service
      • Capacity and operational risk
  • Build executive-level dashboards and reporting
  • Establish program governance, risk tracking, and RACI clarity
  • Present program status, metrics, and operational risks to senior leadership

Requirements:

  • 8+ years in Technical Program Management, Infrastructure Ops, or Data Center Ops
  • Experience leading large, cross-functional infrastructure programs
  • Strong understanding of:
      • Data center power and cooling fundamentals
      • Network and storage basics
      • Hardware-centric platforms
  • Proven ability to define and operationalize metrics
  • Strong written and executive-level communication skills

Nice to have:

  • AI/ML, HPC, or accelerator-based infrastructure
  • High-density and/or liquid-cooled data centers
  • Working with colocation providers and facilities teams
  • Incident management, reliability, or service operations background

What we offer:

  • Build a breakthrough AI platform beyond the constraints of the GPU
  • Publish and open source your cutting-edge AI research
  • Work on one of the fastest AI supercomputers in the world
  • Enjoy job stability with startup vitality
  • Work in a simple, non-corporate culture that respects individual beliefs

Additional Information:

Job Posted:
February 17, 2026

Work Type:
On-site work
