Crusoe's mission is to accelerate the abundance of energy and intelligence. We’re crafting the engine that powers a world where people can create ambitiously with AI, without sacrificing scale, speed, or sustainability. Be a part of the AI revolution with sustainable technology at Crusoe. Here, you'll drive meaningful innovation, make a tangible impact, and join a team that’s setting the pace for responsible, transformative cloud infrastructure.
About the Role:
Once a Spark (Modular Data Center) is deployed, it must stay up. We are seeking a Manager, Field Ops to build the operational capability for maintaining a geodiverse fleet of high-density Spark AI data centers. You are the leader who ensures that our assets perform with Tier 3 reliability.
Job Responsibility:
Availability Management: Accountable for "Uptime" and "Time to Repair" metrics across the Spark fleet. Develop the “playbook” and SOPs to ensure that each Spark unit is deployed, powered & commissioned successfully
Maintenance Scheduling: Create and own the maintenance playbook and SOPs for deployed Spark units. Develop and oversee preventative maintenance for electrical distribution, HVAC, and fire suppression systems
SLA Enforcement: Ensure our managed services customers receive the reliability and performance guaranteed in our service agreements
DIG & Spark Interface: Act as the Spark field operations interface with Crusoe’s DIG team. Incorporate DIG best practices, standardize Spark operations, and drive alignment with DIG to simplify operations and maintenance protocols across Spark deployments
Liquid Cooling Maintenance Transition: Lead and standardize the operational playbook for all air-cooled Spark units to ensure uptime and meet Cloud/Product team SLA requirements
CDU/DLC Ops: Lead the operational readiness for maintaining liquid-cooled infrastructure (CDUs, cooling loops, leak detection)
Technical Training: Standardize the training protocols for field technicians to handle next-gen Spark units and GPU clusters
Logistics & Safety SOPs: Build a standardized logistics plan for critical spares to ensure 4-hour response times at remote sites
Build a standardized staffing strategy for operating and maintaining remote sites and meeting the SLA requirement of 4-hour response times
Implement rigorous OSHA and LOTO (Lockout-Tagout) protocols across all field work to ensure a zero-incident culture
Requirements:
10+ years of experience in data center operations or high-stakes field services
Hands-on expertise with industrial equipment (UPS, transformers) and a clear roadmap for maintaining liquid-cooled (DLC) systems
Experience managing "lights-out" or unmanned facilities across a broad geographic footprint
Ability to build a global sparing strategy: you know what needs to be in a "crash kit" to hit a 4-hour SLA
Expert knowledge of OSHA standards and LOTO protocols for high-voltage and mission-critical systems
A gift for writing clear, actionable SOPs and Emergency Operating Procedures (EOPs) that ensure "Mountaineer" level safety
What we offer:
Restricted Stock Units in a fast-growing, well-funded technology company
Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
Employer contributions to HSA accounts
Paid Parental Leave
Paid life insurance, short-term and long-term disability