Production Engineer, Storage

United States, San Francisco, Sunnyvale · 166000.00 - 201000.00 USD / Year · Job Posted February 21, 2026

Job Description

At Crusoe Energy Systems, our Site Reliability Engineering (SRE) team plays a mission-critical role in maintaining the performance and reliability of our AI-optimized cloud infrastructure. The Storage-focused SRE role is responsible for ensuring the availability, performance, and scalability of Crusoe’s cloud storage products and services, which power compute-intensive, latency-sensitive workloads for AI and HPC use cases. This role directly supports our vertically integrated, sustainable cloud platform by building and optimizing distributed, fault-tolerant storage systems at scale.

Job Responsibility

  • Build automation and self-healing tools to monitor and maintain Crusoe’s distributed cloud storage infrastructure
  • Drive reliability initiatives focused on data replication, encryption, backup and restore strategies, and robust failover mechanisms
  • Help implement and maintain high-performance NVMe- and SSD-backed volumes that support large-scale AI compute clusters
  • Support user-facing storage services with a focus on availability, performance tuning, and adherence to error budgets
  • Investigate and resolve storage-related incidents using deep telemetry, logs, and performance profiling
  • Partner with hardware and kernel teams to diagnose low-level I/O issues and optimize I/O paths, cache policies, and file systems
  • Contribute to the architecture of fault-tolerant, scalable storage backends tailored for AI-first cloud environments

Requirements

  • 5+ years of professional experience in SRE, systems, or storage engineering
  • Hands-on experience with distributed storage systems (e.g., Ceph, GlusterFS, OpenEBS) and deep understanding of object, block, and file storage paradigms
  • Proficiency in a programming language such as Python, Go, Java, or C
  • Experience with Infrastructure as Code and deployment tooling such as Terraform, Ansible, or Puppet
  • Deep knowledge of Linux internals with a focus on I/O subsystems, memory management, and storage scheduling
  • Familiarity with storage protocols like NFS, SMB, iSCSI, or NVMe-oF
  • Strong experience working with containerized workloads and orchestration platforms (e.g., Kubernetes, Docker)
  • Excellent incident response, troubleshooting, and documentation practices
  • Experience building and operating managed services at scale, such as object, file, and block storage (AWS, GCP, Azure)
  • Excellent communication skills
  • Must be able to pass a background check
  • Embody the Company values

Nice to have

  • Contributions to open-source storage projects or the Linux storage stack
  • Experience with hybrid storage models across on-prem and cloud environments
  • Familiarity with high-throughput network topologies for storage backplanes (e.g., RoCE, RDMA, InfiniBand)

What we offer

  • Restricted Stock Units in a fast-growing, well-funded technology company
  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance, short-term and long-term disability
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
  • Subscription to the Calm app
  • MetLife Legal
  • Company-paid commuter benefit: $300 per month

Similar Jobs for Production Engineer, Storage

8 matching positions

Principal Engineer - Data path - HPE Alletra Storage MP X10000 (Object Storage product development)

Develops organization-wide architectures and methodologies for software systems ...
Location: India, Bengaluru
Salary: Not provided
Company: Hewlett Packard Enterprise (https://www.hpe.com/)
Expiration Date: Until further notice
Requirements
  • Bachelor's or master's degree in Computer Science, Information Systems, or equivalent
  • 15+ years of experience in a product development environment on storage/system engineering
  • Track record of delivering V1 products (or early-stage product development) in modern storage technologies (Object/File storage for modern AI use-cases, Object storage, cloud storage)
  • A track record of establishing and assuring adherence to performance requirements, work plans, and schedules for significant engineering initiatives
  • Experience designing and developing software systems design tools and languages
  • Experience in storage product development either file, block or object storage
  • Excellent analytical and problem-solving skills
  • Experience in overall architecture of software systems for products and solutions
  • Designing and integrating software systems running on multiple platform types into overall architecture
  • Evaluating and selecting forms and processes for software systems testing and methodology, including writing and execution of test plans, debugging, and testing scripts and tools
Job Responsibility
  • Develops organization-wide architectures and methodologies for software systems design and development across multiple platforms and organizations within the Global Business Unit
  • End-to-End Ownership and Technical Leadership
  • Identifies and evaluates new technologies, innovations, and outsourced development partner relationships for alignment with technology roadmap and business value; creates plans for integration and update into architecture
  • Anticipate bottlenecks and architect innovative solutions
  • Reviews and evaluates designs and project activities for compliance with development guidelines and standards; provides tangible feedback to improve product quality and mitigate failure risk
  • Drive best practices and operational excellence both at the team and organizational level
  • Coach and mentor junior and mid-level developers to help them grow technically and understand best practices
  • Leverages recognized domain expertise, business acumen, and experience to influence decisions of executive business leadership, outsourced development partners, and industry standards groups
What we offer
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
  • Full-time

Distributed Storage Engineer - Vice President

Location: India, Chennai
Salary: Not provided
Company: Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 11+ years of relevant experience in a storage engineering role or equivalent
  • Deep understanding of block storage technology such as SAN and/or software-defined storage
  • Strong Dell PowerFlex storage experience or equivalent (e.g., IBM Ceph or similar) is desired
  • Extensive IP storage networking knowledge, such as NVMe over Fabrics (NVMe-oF) or iSCSI
  • Deep understanding of storage technologies such as RAID, erasure coding, mirroring, disaster recovery, replication, and snapshots
  • Extensive knowledge of Linux, VMware, and Windows to allow seamless integration of the storage platform
  • Consistently demonstrates clear and concise written and verbal communication
  • Demonstrated analytic/diagnostic skills
  • Ability to work in a matrix environment and partner with virtual teams
  • Ability to work independently, prioritize, and take ownership of various parts of a project or initiative
Job Responsibility
  • Evaluate, test, design, and certify best-of-breed block storage solutions to support critical financial applications
  • Provide engineering level support to the global storage operations team
  • Train and educate operations team members on various new storage solutions as necessary
  • Develop best practices and standards to ensure consistent production deployment models and support
  • Perform periodic security risk reviews and assessments of storage solutions to ensure compliance with Citi's security standards
  • Develop automation to streamline operational deployment and reduce/eliminate manual touchpoints
  • Advise or mentor junior team members
  • Impact the engineering function by influencing decisions through advice, counsel or facilitating services
  • Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency
  • Full-time

Senior .NET Engineer (Storage Infrastructure)

Our Client's team is looking for a self-motivated software engineer to join deve...
Location:
Salary: Not provided
Company: N-iX (n-ix.com)
Expiration Date: Until further notice
Requirements
  • 8+ years of software engineering experience in high scale distributed systems
  • 8+ years of experience building resilient and highly available web services
  • Experience documenting architectural standards and decisions
  • Experience in full stack development
  • B.S., M.S., or PhD in Computer Science or equivalent experience
Job Responsibility
  • Design, develop, and maintain high-performance backend systems and APIs using C# and .NET technologies, hosted in Azure and in data centers at various compliance levels
  • Leverage Azure services like Azure App Services, Azure Kubernetes Service (AKS), Azure Blob Storage, and SQL/NoSQL databases to build scalable, secure, and reliable cloud-native solutions
  • Build and maintain microservices-based architectures using C#, ASP.NET, and others
  • Design and implement RESTful or gRPC APIs and ensure seamless integration with other systems and products
  • Optimize architecture and solution for scalability and availability with cost and maintenance in mind
  • Identify and address performance bottlenecks and scalability challenges proactively
  • Align with other teams on designs, and communicate and resolve roadblocks
  • Guide and mentor other engineers through design and code reviews
What we offer
  • Flexible working format - remote, office-based or flexible
  • A competitive salary and good compensation package
  • Personalized career growth
  • Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
  • Active tech communities with regular knowledge sharing
  • Education reimbursement
  • Memorable anniversary presents
  • Corporate events and team buildings
  • Other location-specific benefits

Production Engineer

Meta Platforms, Inc. (Meta), formerly known as Facebook Inc., builds technologie...
Location: United States, Bellevue
Salary: 134721.00 - 165000.00 USD / Year
Company: Meta (meta.com)
Expiration Date: Until further notice
Requirements
  • Bachelor's degree (or foreign equivalent) in Computer Science, Engineering, Information Systems, Analytics, Mathematics, Physics, Applied Sciences or a related field
  • UNIX or Linux operating system fundamentals
  • TCP/IP network fundamentals
  • Coding in at least one of the following higher-level programming languages: PHP, Python, C++, or Java
  • Software frameworks and APIs
  • Internet service architectures (such as load balancing, LAMP, or CDNs)
  • Configuring and maintaining applications using at least one of the following: web servers, load balancers, relational databases, storage systems, or messaging systems
  • Relational Databases including MySQL
  • Network protocols including at least one of the following: NFS, DHCP, NTP, SSH, DNS, or SNMP
  • Network Management tools like DHCP, NTP, SSH, DNS, or SNMP
Job Responsibility
  • Develop, design, create, modify, and/or test software services to ensure optimal performance and capacity for growth
  • Own back-end data warehouse services, front-end services like Messenger and Newsfeed, and infrastructure components to ensure services run without incident
  • Write and review code, develop documentation and capacity plans, and debug the problems in real time in highly complex software systems
  • Serve as an escalation contact for service incidents
What we offer
  • bonus
  • equity
  • benefits
  • Full-time

Production Engineer

Meta Platforms, Inc. (Meta), formerly known as Facebook Inc., builds technologie...
Location: United States, Menlo Park
Salary: 161637.00 - 165000.00 USD / Year
Company: Meta (meta.com)
Expiration Date: Until further notice
Requirements
  • Requires a Master's degree (or foreign degree equivalent) in Computer Science, Engineering, Information Systems, Analytics, Mathematics, Physics, Applied Sciences, or a related field
  • Requires completion of at least one university-level course, research project, thesis, or internship including the following: UNIX or Linux operating system fundamentals
  • TCP/IP network fundamentals
  • Coding in at least one of the following higher-level programming languages: PHP, Python, C++, or Java
  • Software frameworks and APIs
  • Performing 'guerrilla capacity planning' for internet service architectures
  • Internet service architectures (such as load balancing, LAMP, or CDNs)
  • Configuring and maintaining applications using at least one of the following: web servers, load balancers, relational databases, storage systems, or messaging systems
  • Relational Databases including MySQL
  • Network protocols including at least one of the following: NFS, DHCP, NTP, SSH, DNS, or SNMP
Job Responsibility
  • Develop, design, create, modify, and/or test software services to ensure optimal performance and capacity for growth
  • Own back-end data warehouse services, front-end services like Messenger and Newsfeed, and infrastructure components to ensure services run without incident
  • Write and review code, develop documentation and capacity plans, and debug the problems in real time in highly complex software systems
  • Serve as an escalation contact for service incidents
What we offer
  • bonus
  • equity
  • benefits
  • Full-time

Lead SMT Production Engineer

This is an exciting opportunity to join a leading electronic systems and manufac...
Location: United Kingdom, Merseyside
Salary: Not provided
Company: Zenovo (zenovo.co.uk)
Expiration Date: Until further notice
Requirements
  • Minimum 2 years’ experience in an SMT Production role
  • Proven electromechanical skills applicable to product development
  • Proficient in PCB and electronics
  • Experience setting up new design stencil machines, pick and place machines, reflow ovens, and AOI testing machines
  • Ability to communicate across multi-disciplinary teams.
Job Responsibility
  • Setting up and operating design stencil machines, pick-and-place machines, reflow ovens, and AOI testing machines
  • Troubleshooting and resolving manufacturing issues to ensure smooth production flow
  • Collaborating closely with manufacturers and final assembly teams to maintain quality and efficiency
  • Managing component purchasing and storage whilst adhering to specifications
  • Ensuring conventional component and manual soldering procedures comply with IPC-A-610 standards
  • Full-time

Data Engineer (Production Support) for AWS EMR

We are seeking a highly skilled and motivated Data Engineer specializing in Prod...
Location: China, Shanghai
Salary: Not provided
Company: NTT DATA (nttdata.com)
Expiration Date: Until further notice
Requirements
  • Proficiency in managing AWS services, particularly EMR, S3, Lambda, Step Functions, and CloudWatch
  • Hands-on experience with distributed data processing frameworks like Apache Spark, Hive, or Presto
  • Experience with Kafka, NiFi, Amazon Web Services (AWS), Maven, Ambari-TEZ, Stash, and Bamboo
  • Familiarity with data loading tools like Talend or Sqoop
  • Familiarity with cloud databases like AWS Redshift, Aurora MySQL, and PostgreSQL
  • Knowledge of workflow schedulers like Oozie or Apache Airflow
  • Strong knowledge of shell scripting, Python, or Java for scripting and automation
  • Familiarity with SQL and query optimization techniques
  • Experience in production support for large-scale distributed systems or data platforms
  • Ability to analyze logs, diagnose issues, and implement fixes in high-pressure scenarios
Job Responsibility
  • Monitor, troubleshoot, and resolve issues in real-time for AWS EMR clusters and associated data pipelines
  • Investigate and debug data processing failures, latency issues, and performance bottlenecks
  • Provide support for mission-critical production systems as part of an on-call rotation
  • Manage AWS EMR cluster lifecycle, including creation, scaling, termination, and optimization
  • Ensure effective resource utilization and cost optimization of clusters
  • Apply patches and upgrades to EMR clusters and software components as needed
  • Maintain and support ETL/ELT pipelines built on tools such as Apache Spark, Hive, or Presto running on EMR
  • Ensure data quality, consistency, and availability across pipelines and storage systems like S3, Redshift, MySQL, or Snowflake
  • Implement and monitor automated workflows using AWS tools like Step Functions, Lambda, and CloudWatch
  • Analyze and optimize EMR job performance by tuning Spark/Hive configurations and improving query efficiency
  • Full-time

Production Engineer

Meta Platforms, Inc. (Meta), formerly known as Facebook Inc., builds technologie...
Location: United States, Bellevue
Salary: 179519.00 - 209000.00 USD / Year
Company: Meta (meta.com)
Expiration Date: Until further notice
Requirements
  • Master's degree (or foreign equivalent) in Computer Science, Engineering, Information Systems, Analytics, Mathematics, Physics, Applied Sciences or a related field and 1 year of experience in the job offered or in a computer-related occupation
  • Experience must include 1 year in the following: UNIX or Linux operating system fundamentals
  • TCP/IP network fundamentals
  • Coding in at least one of the following higher-level programming languages: PHP, Python, C++, or Java
  • Software frameworks and APIs
  • Performing 'guerrilla capacity planning' for internet service architectures
  • Internet service architectures (such as load balancing, LAMP, or CDNs)
  • Configuring and maintaining applications using at least one of the following: web servers, load balancers, relational databases, storage systems, or messaging systems
  • Relational Databases including MySQL
  • Network protocols including at least one of the following: NFS, DHCP, NTP, SSH, DNS, or SNMP
Job Responsibility
  • Develop, design, create, modify, and/or test software services to ensure optimal performance and capacity for growth
  • Own back-end data warehouse services, front-end services like Messenger and Newsfeed, and infrastructure components to ensure services run without incident
  • Write and review code, develop documentation and capacity plans, and debug the problems in real time in highly complex software systems
  • Serve as an escalation contact for service incidents
What we offer
  • bonus
  • equity
  • benefits
  • Full-time