Datacenter Hardware Operations Lead

OpenAI

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

125600.00 - 228000.00 USD / Year

Job Description:

We are seeking a Datacenter Hardware Operations Lead focused on hardware operations, network operations, and logistics, with 15+ years of experience managing complex, mission-critical data center environments. You will oversee and optimize day-to-day physical hardware operations, including material movement and hardware maintenance, across our expanding global footprint. This role includes designing scalable repair and logistics systems, managing vendors, and ensuring seamless coordination across facilities, supply chain, and engineering.

Job Responsibility:

  • Collaborate with internal and external teams to establish the hardware operations strategy, critical metrics and SLAs
  • Lead daily physical operations for data center campuses, from commissioning through ongoing maintenance
  • Design and implement robust logistics systems for material movement, repairs, and operational workflows
  • Collaborate with engineering, construction, supply chain, and operations teams to streamline processes and resolve bottlenecks
  • Develop tools and practices for improved traceability, throughput, and vendor coordination
  • Manage relationships with external partners and ensure alignment with operational goals
  • Apply best practices in logistics and operational planning to support scalable infrastructure growth

Requirements:

  • 15+ years of experience in physical operations and logistics for mission-critical infrastructure and data centers
  • Proven ability to manage complex logistics systems and coordinate across disciplines
  • Deep understanding of operational processes for large-scale facilities including maintenance, construction support, and warehousing
  • Experience leading cross-functional initiatives and working with third-party vendors
  • Bachelor's degree in Engineering, Logistics, Operations Management, or a related field (advanced certifications preferred)

Nice to have:

  • 20+ years of experience managing global-scale data center operations and logistics
  • Expertise in supply chain logistics and systems implementation for hyperscale environments
  • Strong leadership in dynamic, high-pressure settings with evolving technical needs

What we offer:

  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Relocation support for eligible employees
  • Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided
  • Equity
  • Performance-related bonus(es) for eligible employees

Additional Information:

Job Posted:
February 21, 2026

Employment Type:
Full-time

Work Type:
Hybrid work

Similar Jobs for Datacenter Hardware Operations Lead

Datacenter Hardware Operations Technician, AI Compute Infrastructure - Stargate

OpenAI, in close collaboration with our capital partners, is embarking on a jour...
Location:
United States, Abilene, Texas
Salary:
86400.00 USD / Year
OpenAI
Expiration Date:
Until further notice
Requirements:
  • 7+ years of experience in datacenter hardware operations, hardware engineering, or large-scale server maintenance
  • At least 2 years in a senior or lead technician capacity
  • Deep knowledge of high-density server hardware, including x86 platforms, GPUs, storage devices, and power/cooling systems
  • Excel at diagnosing hardware issues, coordinating complex repairs, and maintaining strong working relationships across organizations
  • Comfortable setting technical expectations and validating outcomes through collaboration, not direct management
  • Adapt quickly to changing operational conditions and enjoy solving problems at both the strategic and on-site levels
  • Communicate clearly and build trust across partner teams, vendors, and internal engineering stakeholders
  • Willing to be based full-time at a partner-operated campus
Job Responsibility:
  • Serve as OpenAI’s primary on-site hardware contact, collaborating with Oracle teams and vendors to plan and coordinate maintenance, repairs, and lifecycle activities
  • Share technical requirements and verify that work performed supports OpenAI’s compute needs and agreed quality targets
  • Coordinate schedules, spare-parts planning, and issue escalation with partner teams to minimize downtime and keep operations running smoothly
  • Work with OpenAI fleet-health engineers to translate software-detected issues into on-site hardware actions in partnership with Oracle
  • Track hardware trends and provide joint recommendations with partner teams for design or operational improvements
  • Prepare documentation and runbooks that capture joint best practices and can be applied at additional campuses
  • Offer technical guidance and context to partner personnel while respecting their operational ownership
  • Collaborate with supply-chain teams to plan spares and manage hardware lifecycle activities
What we offer:
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
Employment Type:
Full-time

Senior Technical Program Manager

Microsoft’s Cloud Operations & Innovation (CO+I) organization powers the infrast...
Location:
United States, Redmond
Salary:
119800.00 - 234700.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree AND 4+ years experience in engineering, product/technical program management, data analysis, or product development OR equivalent experience
  • 2+ years of experience managing cross-functional and/or cross-team projects
  • Communicate complex technical and operational topics clearly and concisely to senior leaders
  • Drive end‑to‑end orchestration across upstream telemetry producers and downstream incident response and operations consumers
  • Facilitate cross‑team design discussions to align on workflows, validation criteria, handoffs, and governance models
  • 8+ years of experience in technical program management, engineering, or reliability/observability domains, preferably in the Datacenter Critical Environment space
  • Demonstrated ability to lead complex, multi‑team initiatives from concept to production in large‑scale environments
  • Ability to read and reason about technical documentation, schemas, APIs, and data models
  • Strong analytical and problem‑solving skills; comfortable working with metrics, dashboards, instrumentation, and system‑performance data
Job Responsibility:
  • Lead delivery of RADAR’s mission by implementing and scaling sensor‑health detection, alerting, and triage capabilities across Microsoft datacenters
  • Design and operationalize core workflows for sensor‑health detection, alert routing, validation, and triage
  • Drive cross‑team orchestration by creating and strengthening relationships across engineering, hardware, operations, and service teams
  • Build and manage onboarding processes for new telemetry types and detection scenarios
  • Champion Process Excellence by maturing workflows, training partners, and driving adoption of consistent operating models
  • Lead partner alignment and influence to shape and deliver shared roadmaps across divisional boundaries
  • Identify gaps and opportunities through structured feedback loops; synthesize insights into clear problem statements
  • Manage schedules and execution across epics, sprints, semester plans, and releases
  • Produce clear technical documentation including specifications, decision records, runbooks, and operational procedures
Employment Type:
Full-time

Data Center Technician

Summary: Remove network cables as part of a datacenter decommission Run/Label/St...
Location:
United States, Boydton
Salary:
18.00 - 22.00 USD / Hour
Apex Systems
Expiration Date:
Until further notice
Requirements:
  • High school diploma, GED, or equivalent
  • Basic knowledge of computer hardware, servers, and components
  • Basic understanding of how to use Microsoft Office applications (Outlook, Excel, Word)
  • Flexibility to work non-standard business hours that may include weekends and/or holidays
  • Associate's degree in computer programming or equivalent training required
  • 0-2 years experience required
  • Verbal and written communication skills, problem solving skills, customer service and interpersonal skills
  • Basic ability to work independently and manage one’s time
Job Responsibility:
  • Monitor the system for equipment failure or errors in performance
  • Respond to program error messages by finding and correcting problems or terminating the program
  • Help programmers and systems analysts test and debug new programs
  • Operate spreadsheet programs and other types of software to load and manipulate data and to produce reports
  • Remove network cables as part of a datacenter decommission
  • Run/Label/Stage Power & Networking Cables
  • DBD removal only (does not perform any scanning)
  • Follows procedures for preparing, installing, performing diagnostics, troubleshooting, replacing, and/or decommissioning equipment with guidance from Datacenter Technician Leads
  • Prepares, stages, sets up, and performs basic startups and shutdowns for hardware following written instructions, checklists, guides, standard procedures, and with guidance from Datacenter Technician Leads
  • Physical audits of datacenter assets
What we offer:
  • Medical, dental, vision, life, disability, and other insurance plans
  • ESPP (employee stock purchase program)
  • 401K program with company match after 12 months
  • HSA (Health Savings Account on the HDHP plan)
  • SupportLinc Employee Assistance Program (EAP) with up to 8 free counseling sessions
  • Corporate discount savings program
  • On-demand training program
  • Access to certification prep and library of technical and leadership courses/books/seminars after 6 months tenure
  • Certification discounts and other perks (CompTIA, IIBA)
  • Dedicated customer service team
Employment Type:
Full-time

Principal Engineer

Microsoft Silicon, Cloud Hardware, and Infrastructure Engineering (SCHIE) is the...
Location:
United States, Redmond
Salary:
139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Master's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, OR related field AND 7+ years technical engineering experience OR Bachelor's Degree in Electrical Engineering, Computer Engineering, Mechanical Engineering, OR related field AND 8+ years technical engineering experience OR equivalent experience
  • 5+ years of technical leadership experience as a platform, software, or validation architect, a lead debug engineer, or in an equivalent industry leadership position
  • Deep understanding of modern server or datacenter architectures or System on Chip features like virtualization technologies or major architectural blocks like Memory Controllers or Central Processing Units or Storage or Networking solutions for Cloud or Datacenter infrastructures
  • Experience leading technical deep dives into datacenter software solutions used in at scale environments or datacenter infrastructure and data systems, cloud native operating systems, or virtualization technologies
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
  • This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Job Responsibility:
  • Lead development and implementation of end-to-end debug solutions for at-scale datacenter systems
  • Lead collaboration projects with hardware, firmware and software teams that drive root cause analysis
  • Accountable for successful execution of targeted defect reduction projects
  • Provide technical recommendations on at scale test content deployment technologies
  • Lead resolution of complex problems based on technical and business understanding
  • Develop world class at scale debug methodologies, test strategies and test routines in data center solutions
  • Solve problems relating to mission critical services and build automation to drive debug efficiency
  • Effectively communicate with partners and stakeholders for planning and progress on initiatives using data
  • Embody our culture and values
Employment Type:
Full-time

Operations Program Manager, AI Infrastructure

OpenAI’s Hardware organization develops silicon and system-level solutions desig...
Location:
United States, San Francisco
Salary:
177000.00 - 285000.00 USD / Year
OpenAI
Expiration Date:
Until further notice
Requirements:
  • 8+ years of experience in Operations, Engineering, Program Management, or equivalent, within hardware development, manufacturing, or supply chain domains (compute, networking, datacenter, or similarly complex systems)
  • Proven track record leading complex hardware NPI programs end-to-end, from early bring-up through production ramp
  • Strong understanding of manufacturing and supply chain fundamentals, including BOM management, ECO/MCO processes, build readiness, factory test, quality controls, and material planning
  • Demonstrated ability to lead cross-functional teams, influence senior stakeholders, and drive decisions in ambiguous, time-compressed environments
  • Exceptional written and verbal communication skills, with the ability to distill complex issues for executive and external audiences
Job Responsibility:
  • Act as the single-threaded owner for operational readiness across NPI and ramp, accountable for outcomes from early bring-up through sustained production
  • Translate OpenAI’s infrastructure strategy and engineering objectives into clear operating plans, execution priorities, and decision frameworks
  • Drive alignment across Engineering, Operations, Strategic Sourcing, Finance, Capacity Planning, and Executive stakeholders by framing tradeoffs, risks, and recommendations
  • Proactively identify inflection points where decisions or investments are required to protect long-term scale, reliability, or cost targets
  • Influence operational strategy with manufacturing partners by setting expectations on execution rigor, accountability, and continuous improvement
  • Drive overall NPI build readiness, including material accountability, manufacturing and test readiness, product data availability, factory infrastructure, and qualification plans
  • Lead transition activities from NPI to mass production, partnering closely with Sustaining Operations teams to ensure seamless ownership transfer
  • Translate engineering requirements into actionable, factory-ready plans with tier-1 manufacturing and integration partners
  • Lead cross-functional build and debug cadences; ensure issues are clearly owned, aggressively driven, and formally closed with root cause and prevention
What we offer:
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
Employment Type:
Full-time

Distinguished Engineer - Networking

GEICO is seeking an experienced Distinguished Engineer with a passion for buildi...
Location:
United States, Chevy Chase
Salary:
130000.00 - 300000.00 USD / Year
Geico
Expiration Date:
Until further notice
Requirements:
  • Fluency and specialization in software development and best practices using programming languages such as Golang and Python
  • Understanding of datacenter and LAN/WAN network designs with a focus on underlay networks and physical infrastructure
  • Understanding of operating systems and how they interface with hardware
  • Understanding of datacenter facilities, lifecycle and urbanization
  • Understanding of SQL and NoSQL databases, including stateful services management and storage
  • Understanding of networking, caches, key/value stores, load balancing, global load balancing, queues, DNS and CDN
  • Primary focus on managing infrastructure through code
  • Deep knowledge of SRE practices, methodologies, and principles, along with a solid understanding of on-prem and public cloud-based network, compute, and storage technologies
  • In-depth knowledge of hybrid cloud architecture, IaaS and PaaS technologies, container orchestration platforms (e.g., Kubernetes), cloud efficiency and observability etc.
  • Strong background in incident management
Job Responsibility:
  • Provide thought leadership in datacenter reliability for networks and servers, staying ahead of industry trends and emerging technologies
  • Conduct comprehensive risk assessments to identify potential threats and vulnerabilities
  • Design and implement robust strategies to ensure maintainability and observability of our hardware and operating system assets
  • Lead the design and architecture of resilient and scalable systems, considering both on-premises and cloud-based solutions
  • Collaborate with cross-functional teams to integrate GEICO best practices into the development and deployment processes
  • Develop and maintain comprehensive incident response plans to address various disaster scenarios affecting our networks and datacenters
  • Conduct regular simulations and drills to ensure the readiness of the organization in the event of a disaster
  • Hands-on software engineering and SDLC best practices (Technical Review Documents, Architecture, Software Development, Software Reviews, Testing, Production Readiness Reviews, among others)
  • Evaluate, select, and implement cutting-edge technologies and tools to enhance our datacenter capabilities including but not limited to processes, compliance, and visibility
  • Stay current with industry best practices and emerging technologies to continuously improve our network and datacenter capabilities
What we offer:
  • Comprehensive Total Rewards program that offers personalized coverage tailor-made for you and your family’s overall well-being
  • Financial benefits including market-competitive compensation; a 401K savings plan vested from day one that offers a 6% match; performance and recognition-based incentives; and tuition assistance
  • Access to additional benefits like mental healthcare as well as fertility and adoption assistance
  • Support for flexibility: we provide workplace flexibility as well as our GEICO Flex program, which offers the ability to work from anywhere in the US for up to four weeks per year
Employment Type:
Full-time

Mechanical Engineer

Microsoft’s Cloud Operations & Innovation (CO+I) is the engine that powers our c...
Location:
Greece, Athens
Salary:
Not provided
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Extensive related technical engineering experience OR Bachelor's degree in Mechanical Engineering, or related field AND related technical engineering experience OR Master's degree in Mechanical Engineering, or related field AND related technical engineering experience OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • Ability to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Job Responsibility:
  • Act as an operations Subject Matter Expert (SME) with primary focus on mechanical systems, from cooling to water treatment facilities
  • Inspect and oversee critical environment facility equipment (e.g., electrical, HVAC, mechanical, and control systems), buildings, and grounds to identify unsafe or abnormal conditions
  • Serve as the technical authority for on-site operations of large-scale electrical power distribution, control systems, and designs
  • Lead project execution, owning technical solutions, vendor delivery, performance management, and proactive risk identification
  • Own Maintenance and Repair contractual scope from your vendors
  • Actively contribute to operational budget planning and CAPEX (projects)
  • Use the Computerized Maintenance Management System (CMMS) to track equipment assets and execute maintenance work orders
  • Develop, enhance, and maintain operational procedures such as EOPs, MOPs, and SOPs
  • Perform planned, predictive, and corrective maintenance following approved procedures across mechanical and related CE systems
  • Collaborate with suppliers and maintenance vendors to ensure spare parts strategy and availability
Employment Type:
Full-time

Senior Mechanical Engineer

Microsoft Cloud Operations and Innovation (CO+I) is the team behind the cloud. W...
Location:
United States, Redmond
Salary:
119800.00 - 234700.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Doctorate in Mechanical Engineering, or related field OR Master's Degree in Mechanical Engineering, or related field AND 3+ years related technical engineering experience OR Bachelor's Degree in Mechanical Engineering, or related field AND 4+ years related technical engineering experience OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • Ability to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
  • Doctorate in Mechanical Engineering, or related field AND 3+ years related technical engineering experience OR Master's Degree in Mechanical Engineering, or related field AND 6+ years related technical engineering experience OR Bachelor's Degree in Mechanical Engineering, or related field AND 8+ years related technical engineering experience OR equivalent experience
  • 8+ years of technical experience designing complex mechanical systems
  • 8+ years of datacenter design experience in a consulting role
  • 6 years as lead designer focused on datacenter projects
  • Experience with designing cooling systems for liquid-cooled IT hardware
  • Experience designing mechanical HVAC, plumbing, fire protection, fuel oil, and complex sequences of operations for mechanical systems for implementation in building automation systems
  • Ability to lead a team of engineers in advanced engineering and operational processes and procedures as they relate to hyperscale datacenter environments
Job Responsibility:
  • Understanding of datacenter HVAC cooling systems, cooling equipment selection, plumbing systems, fire protection systems, fuel systems and mechanical Building Automation System (BAS) controls
  • Direct design decisions related to datacenter projects including new construction, expansions, retrofits, and upgrades
  • Review construction drawings, specifications, design guides/standards, sequences of operation, and commissioning documents for technical compliance
  • Engage with and manage external design consultants and run design workshops and workgroups to ensure they deliver a design from Basis of Design (BOD) to Issue for Construction (IFC) and through the Construction Administration phase of the datacenter to our Microsoft design standards
  • Collaborate with Global Technical Governance team, Global Strategy team, and DCE Operations team to develop datacenter designs, first-of-kind pilots and prototypes, and maintain the design standard documents (drawings, specifications, equipment technical submittals, etc.) to ensure safe, cost-effective, quality builds
  • Lead the delivery of mechanical system design for first-of-kind datacenter prototypes
  • Travel to sites for “Factory Witness Testing”, equipment and engineering vendor interaction in the design process, as well as final commissioning of the installed systems
  • Travel to datacenter projects for engineering audits and reviews
Employment Type:
Full-time