Storage Engineer (Ceph)

Adyen

Location:
United States, Chicago

Contract Type:
Not provided

Salary:

180000.00 - 243000.00 USD / Year

Job Description:

As a Storage Engineer on our Storage team in Chicago, you will be responsible for designing, implementing, managing, and maintaining data storage systems and solutions. This role involves end-to-end management of our organization's data storage infrastructure, including strategic design for scalability, hands-on implementation and integration, continuous performance monitoring, capacity planning, security, and troubleshooting. It also requires proactive maintenance (updates, backups, and disaster recovery) and planning for future growth by evaluating and recommending new technologies that improve data availability, reliability, and efficiency.

Job Responsibility:

  • Design and deploy storage infrastructure (SAN, NAS, DAS, Object Storage) in accordance with organizational needs
  • Monitor and maintain storage systems to guarantee optimal performance, availability, and reliability
  • Perform capacity planning and forecasting for future storage needs
  • Develop and implement strategies for backup, replication, and disaster recovery
  • Troubleshoot and resolve issues related to storage systems
  • Maintain documentation of configurations, procedures, and standards
  • Collaborate with other Platform Engineering teams on system upgrades, migrations, and integrations
  • Automate storage provisioning and management using scripting or infrastructure-as-code tools.

Requirements:

  • 3–5+ years of experience in Linux systems administration with a focus on storage (e.g., Ceph)
  • Proven experience with enterprise storage systems (e.g., Pure Storage, NetApp, EMC, Dell, IBM, Hitachi)
  • You have a strong understanding of RAID, LVM, NFS, iSCSI, multipathing, and file systems (ext4, XFS, ZFS, etc.)
  • Hands-on experience with enterprise storage systems and clustering technologies
  • Proficiency in shell scripting
  • You have Python, Puppet or Ansible knowledge
  • Experience with virtualization technologies such as VMware or KVM
  • You have familiarity with cloud storage solutions like AWS S3 and EBS
  • Experience with monitoring tools (Nagios, Prometheus, Zabbix, etc.)

Additional Information:

Job Posted:
May 13, 2026

Employment Type:
Fulltime
Work Type:
On-site work

Similar Jobs for Storage Engineer (Ceph)

Software Engineer, Storage

At evroc, we are building a secure, sovereign, and sustainable hyperscale cloud ...
Location
Sweden, Stockholm
Salary:
Not provided
Evroc
Expiration Date
Until further notice
Requirements
  • Proficiency in distributed systems and Linux systems engineering
  • Coding in programming languages such as C, C++, Golang or Rust
  • Typically 5+ years of experience in building and enhancing clustered storage solutions such as Ceph, Gluster, HDFS, Lustre or MinIO or cloud storage such as AWS S3, EBS, GCP’s Blob/Block/File Storage
  • Extensive experience within the Kubernetes ecosystem developing and running highly available services
  • Deep understanding of databases, file systems and storage protocols
  • Knowledge of performance tuning and optimization techniques
  • Active engagement or contributions to the open-source community
  • Collaborative, curious, and pragmatic Software Engineer who wants to be part of an innovative team
  • Applicants must possess a valid work permit
Job Responsibility
  • Design, develop and maintain evroc’s cloud storage offerings including S3 object storage, block storage and shared file systems
  • Build components using first principles from the ground up to unlock optimization opportunities at every layer of the stack
  • Optimize existing cloud storage solutions for performance ensuring low latency and high throughput, high reliability and cost effectiveness
  • Work with appropriate stakeholders to determine user requirements
  • Leverage a variety of feedback channels to incorporate insights into future designs or solution fixes
  • Seamlessly integrate and upkeep open-source components within our evolving tech stack
  • Team up with fellow engineers to craft tailored solutions meeting our unique challenges
  • Lead the charge in defining and achieving our technical benchmarks
What we offer
  • We offer a competitive salary and an equity package to attract the best
  • Fulltime

Sr. Network Data Center Engineer

If you live and breathe networking, virtualization, and high-availability system...
Location
United States
Salary:
150000.00 USD / Year
Corporate Tools
Expiration Date
Until further notice
Requirements
  • Experience with Proxmox or other hypervisors (VMware, KVM, Xen, Hyper-V)
  • 5+ years of network engineering, data center operations, or cloud infrastructure
  • Experience with Ceph or SAN-based storage solutions (iSCSI, NFS, ZFS)
  • Experience with containers and networking
  • Excellent problem-solving skills and a keen eye for detail
  • Ability to work on projects solo or with a team
  • Love for learning and improving code
  • Strong communication and collaboration skills
  • Understanding of Ceph storage architecture (OSDs, MONs, MDS, RADOS, etc.)
  • Experience in iSCSI/NFS/ZFS SAN setups and performance tuning
Job Responsibility
  • Develop and design robust and scalable software solutions
  • Take ownership of projects from conception to deployment, ensuring timely delivery and meeting the specified requirements
  • Work closely with cross-functional teams, including IT, product management, and other software teams, to ensure seamless integration and alignment with business objectives
  • Stay updated with the latest industry trends, technologies, and best practices to bring innovative solutions to the table
  • Design, implement, and maintain a robust network architecture that supports Proxmox virtualization, Ceph/SAN storage, and container networking
  • Manage firewalls (iptables, pfSense, UFW, etc.) to secure access to virtualized environments and hosting services
  • Configure and optimize VLANs, subnets, and routing to ensure isolated and secure network segments for virtual machines, storage, and frontend applications
  • Configure and maintain VPNs, BGP, OSPF, or other routing protocols to ensure proper network redundancy and failover
  • Set up and maintain bridged, NAT, and VXLAN networking in Proxmox for efficient VM communication
  • Implement high-availability (HA) networking for Hypervisor networks and Ceph/SAN clusters
What we offer
  • 100% employer-paid medical, dental and vision for employees
  • Annual review with raise option
  • 22 days Paid Time Off accrued annually, and 4 holidays
  • After 3 years, PTO increases to 29 days. Employees transition to flexible time off after 5 years with the company—not accrued, not capped, take time off when you want
  • The 4 holidays are: New Year’s Day, Fourth of July, Thanksgiving, and Christmas Day
  • Paid Parental Leave
  • Up to 6% company matching 401(k) with no vesting period
  • Quarterly allowance: use it to make your remote work setup more comfortable, for continuing education classes, a plant for your desk, coffee for your coworker, a massage for yourself... really, whatever
  • Open concept office with friendly coworkers
  • Fulltime

Storage Engineering Manager

We are seeking a seasoned Storage Engineering Manager with experience in the spe...
Location
United States, San Jose; San Francisco; Bellevue
Salary:
297000.00 - 495000.00 USD / Year
Lambda
Expiration Date
Until further notice
Requirements
  • 10+ years of experience in storage engineering, with at least 5 years in a management or lead role
  • Demonstrated experience leading a team of storage engineers and storage SREs on complex, cross-functional projects in a fast-paced startup environment
  • Extensive hands-on experience in designing, deploying, and maintaining distributed storage solutions in a CSP (Cloud Service Provider), NCP (Neo-Cloud provider), HPC-infrastructure integrator, or AI-infrastructure company
  • Experience with storage solutions serving storage volumes at a scale greater than 20PB
  • Strong project management skills, leading high-confidence planning, project execution, and delivery of team outcomes on schedule
  • Extensive experience with storage site reliability engineering
  • Experience with one or more of the following in an HPC or AI Infrastructure environment: Vast, DDN, Pure Storage, NetApp, Weka
  • Experience deploying Ceph at a scale greater than 25PB
  • Experience in serving one or more of the following storage protocols: object storage (e.g., S3), block storage (e.g., iSCSI), or file storage (e.g., NFS, SMB, Lustre)
  • Professional individual contributor experience as a storage engineer or storage SRE
Job Responsibility
  • Grow/Hire, lead, and mentor a top-talent team of high-performing storage engineers delivering HPC, petabyte-scale storage solutions
  • Foster a high-velocity culture of innovation, technical excellence, and collaboration
  • Conduct regular one-on-one meetings, provide constructive feedback, and support career development for team members
  • Drive outcomes by managing project priorities, deadlines, and deliverables using Agile methodologies
  • Drive the technical vision and strategy for Lambda distributed storage solutions
  • Lead storage vendor selection criteria, vendor selection, and vendor relationship management (support, installation, scheduling, specification, procurement)
  • Manage team in storage lifecycle management (installation, cabling, capacity upgrades, service, RMA, updating both hardware and software components as needed)
  • Guide choices around optimization of storage pools, sharding, and tiering/caching strategies
  • Lead team in tasks related to multi-tenant security, tenant provisioning, metering integration, storage protocol interconnection, and customer data-migration
  • Guide Storage SREs in development of scripting and automation tools for configuration management, monitoring, and operational tasks
What we offer
  • Generous cash & equity compensation
  • Health, dental, and vision coverage for you and your dependents
  • Wellness and commuter stipends for select roles
  • 401k Plan with 2% company match (USA employees)
  • Flexible paid time off plan that we all actually use
  • Fulltime

Senior Staff Software Engineer, Storage

As a Senior Staff Software Engineer on the Cloud Storage team, you will lead the...
Location
United States, San Francisco; Sunnyvale
Salary:
245000.00 - 290000.00 USD / Year
Crusoe
Expiration Date
Until further notice
Requirements
  • System Programming Expertise: Proven experience in system programming with languages such as C, C++, and/or Rust
  • Linux Systems Knowledge: Extensive knowledge of Linux Systems Internals and computer architecture
  • Cloud Storage Design & Development: Ability to design, develop, and deploy highly scalable and distributed cloud storage solutions
  • Storage Engineering Fundamentals: Strong understanding of storage engineering concepts, including data protection mechanisms (e.g., redundancy, replication, encryption), fault tolerance, and storage technologies (e.g., NVMe, SSDs)
  • Storage Technologies: In-depth understanding of at least one of the following: block storage, object storage, and/or file storage
  • Storage Protocols: Familiarity with industry-standard storage protocols such as NFS, SMB, iSCSI, and NVMe-oF
  • Software Engineering Best Practices: Expertise in professional software engineering practices, including coding standards, code reviews, source control management, build processes, testing, and operations
  • Open Source Contributions: Demonstrated track record of contributions to the open source community (e.g., Ceph, GlusterFS, OpenEBS)
  • Communication & Collaboration: Excellent communication and collaboration skills, with the ability to effectively communicate technical concepts to both technical and non-technical audiences
Job Responsibility
  • Lead Storage Strategy Development and Execution: Define and execute the roadmap for the Crusoe Cloud storage strategy, aligning with overall business objectives
  • Lead Engineering Team: Serve as the engineering lead for the Cloud Storage team, collaborating with technology and engineering leadership to define and implement long-term strategic goals
  • Guide Engineering Practices: Provide technical leadership and guidance to the engineering team throughout the entire software development lifecycle, including architecture decisions, design reviews, code reviews, implementation tasks, and production support
  • Develop and Optimize Storage Infrastructure: Collaborate closely with the infrastructure organization to design, develop, and optimize industry-leading storage infrastructure solutions
  • Lead File System Development: Lead the development and maintenance of high-performance and reliable file systems, ensuring optimal performance and data integrity
  • Storage Architecture Design: Design and implement robust and scalable storage architectures, considering factors such as performance, reliability, availability, and cost-effectiveness
  • Cross-functional Collaboration: Foster strong collaboration with other teams across the organization, including infrastructure, software engineering, and product development
What we offer
  • Industry competitive pay
  • Restricted Stock Units in a fast growing, well-funded technology company
  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance, short-term and long-term disability
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Fulltime

Staff Site Reliability Engineer, Storage

At Crusoe Energy Systems, our SRE team plays a mission-critical role in maintain...
Location
United States, San Francisco, Sunnyvale
Salary:
204000.00 - 247000.00 USD / Year
Crusoe
Expiration Date
Until further notice
Requirements
  • 8+ years of professional experience in Storage SRE, systems engineering, storage engineering, or similar roles
  • Hands-on experience with distributed storage systems (e.g., Ceph, GlusterFS, OpenEBS) and deep understanding of object, block, and file storage paradigms
  • Proficiency in a programming language such as Go, Python, Java, or C
  • Experience with Infrastructure as Code and deployment tooling such as Terraform, Ansible, or Puppet
  • Deep knowledge of Linux internals with a focus on I/O subsystems, memory management, and storage scheduling
  • Familiarity with storage protocols like NFS, SMB, iSCSI, or NVMe-oF
  • Strong experience working with containerized workloads and orchestration platforms (e.g., Kubernetes, Docker)
  • Excellent incident response, troubleshooting, and documentation practices
  • Experience with building and operating managed services at scale such as object, file and block storage (AWS, GCP, Azure)
  • Excellent communication skills
Job Responsibility
  • Build automation and self-healing tools to monitor and maintain Crusoe’s distributed cloud storage infrastructure, which includes block, file, and object storage systems
  • Drive reliability initiatives focused on data replication, encryption, backup and restore strategies, and robust failover mechanisms
  • Help implement and maintain high-performance NVMe- and SSD-backed volumes that support large-scale AI compute clusters
  • Support user-facing storage services with a focus on availability, performance tuning, and adherence to error budgets
  • Investigate and resolve storage-related incidents using deep telemetry, logs, and performance profiling
  • Partner with hardware and kernel teams to diagnose low-level I/O issues and optimize I/O paths, cache policies, and file systems
  • Contribute to the architecture of fault-tolerant, scalable storage backends tailored for AI-first cloud environments
What we offer
  • Restricted Stock Units in a fast growing, well-funded technology company
  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance, short-term and long-term disability
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
  • Fulltime

Production Engineer, Storage

At Crusoe Energy Systems, our Site Reliability Engineering (SRE) team plays a mi...
Location
United States, San Francisco, Sunnyvale
Salary:
166000.00 - 201000.00 USD / Year
Crusoe
Expiration Date
Until further notice
Requirements
  • 5+ years of professional experience in SRE, systems, or storage engineering
  • Hands-on experience with distributed storage systems (e.g., Ceph, GlusterFS, OpenEBS) and deep understanding of object, block, and file storage paradigms
  • Proficiency in a programming language such as Python, Go, Java, or C
  • Experience with Infrastructure as Code and deployment tooling such as Terraform, Ansible, or Puppet
  • Deep knowledge of Linux internals with a focus on I/O subsystems, memory management, and storage scheduling
  • Familiarity with storage protocols like NFS, SMB, iSCSI, or NVMe-oF
  • Strong experience working with containerized workloads and orchestration platforms (e.g., Kubernetes, Docker)
  • Excellent incident response, troubleshooting, and documentation practices
  • Experience with building and operating managed services at scale such as object, file and block storage (AWS, GCP, Azure)
  • Excellent communication skills
Job Responsibility
  • Build automation and self-healing tools to monitor and maintain Crusoe’s distributed cloud storage infrastructure
  • Drive reliability initiatives focused on data replication, encryption, backup and restore strategies, and robust failover mechanisms
  • Help implement and maintain high-performance NVMe- and SSD-backed volumes that support large-scale AI compute clusters
  • Support user-facing storage services with a focus on availability, performance tuning, and adherence to error budgets
  • Investigate and resolve storage-related incidents using deep telemetry, logs, and performance profiling
  • Partner with hardware and kernel teams to diagnose low-level I/O issues and optimize I/O paths, cache policies, and file systems
  • Contribute to the architecture of fault-tolerant, scalable storage backends tailored for AI-first cloud environments
What we offer
  • Restricted Stock Units in a fast growing, well-funded technology company
  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance, short-term and long-term disability
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
  • Fulltime

Senior Site Reliability Engineer, Storage

At Crusoe Energy Systems, our Site Reliability Engineering (SRE) team plays a mi...
Location
United States, San Francisco, Sunnyvale
Salary:
166000.00 - 201000.00 USD / Year
Crusoe
Expiration Date
Until further notice
Requirements
  • 5+ years of professional experience in SRE, systems, or storage engineering
  • Hands-on experience with distributed storage systems (e.g., Ceph, GlusterFS, OpenEBS) and deep understanding of object, block, and file storage paradigms
  • Proficiency in a programming language such as Python, Go, Java, or C
  • Experience with Infrastructure as Code and deployment tooling such as Terraform, Ansible, or Puppet
  • Deep knowledge of Linux internals with a focus on I/O subsystems, memory management, and storage scheduling
  • Familiarity with storage protocols like NFS, SMB, iSCSI, or NVMe-oF
  • Strong experience working with containerized workloads and orchestration platforms (e.g., Kubernetes, Docker)
  • Excellent incident response, troubleshooting, and documentation practices
  • Experience with building and operating managed services at scale such as object, file and block storage (AWS, GCP, Azure)
  • Excellent communication skills
Job Responsibility
  • Build automation and self-healing tools to monitor and maintain Crusoe’s distributed cloud storage infrastructure
  • Drive reliability initiatives focused on data replication, encryption, backup and restore strategies, and robust failover mechanisms
  • Help implement and maintain high-performance NVMe- and SSD-backed volumes that support large-scale AI compute clusters
  • Support user-facing storage services with a focus on availability, performance tuning, and adherence to error budgets
  • Investigate and resolve storage-related incidents using deep telemetry, logs, and performance profiling
  • Partner with hardware and kernel teams to diagnose low-level I/O issues and optimize I/O paths, cache policies, and file systems
  • Contribute to the architecture of fault-tolerant, scalable storage backends tailored for AI-first cloud environments
What we offer
  • Restricted Stock Units in a fast growing, well-funded technology company
  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance, short-term and long-term disability
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
  • Tuition reimbursement
  • Fulltime

Staff Engineer, Distributed Storage, HPC & AI Infrastructure

In this role, you will design and deliver multi-petabyte storage systems purpose...
Location
Netherlands, Amsterdam
Salary:
Not provided
Together AI
Expiration Date
Until further notice
Requirements
  • 8+ years in storage engineering with 3+ years managing distributed storage at multi-petabyte scale
  • Proven track record deploying and operating high-performance storage for GPU/HPC clusters
  • Deep Kubernetes and cloud-native storage experience in production environments
  • Strong coding skills in Go and Python with demonstrated ability to build production-grade tools
  • BS/MS in Computer Science, Engineering, or equivalent practical experience
  • History of technical leadership: designing systems that significantly improved performance (>3x), reliability (99.9%+ uptime), or cost efficiency
  • Distributed Storage Systems: Deep expertise in WekaFS, Lustre, GPFS, BeeGFS, or similar parallel filesystems at multi-petabyte scale
  • Object Storage: Production experience with S3, MinIO, Ceph, or R2 including performance optimization and cost management
  • Kubernetes Storage: CSI drivers, StatefulSets, PersistentVolumes, storage operators, and custom controllers
  • Storage optimization for GPU workloads, RDMA/InfiniBand networking, parallel filesystem optimization (100+ GB/s aggregate cluster throughput)
Job Responsibility
  • Design multi-petabyte AI/ML storage systems; integrate WekaFS, Ceph, etc.; lead capacity planning and cost optimization (30-50% savings via tiering, lifecycle policies, right-sizing)
  • Design/optimize RDMA, InfiniBand, 400GbE networks; tune for max throughput/min latency; implement NVMe-oF/iSCSI; troubleshoot bottlenecks; optimize TCP/IP for storage
  • Build Kubernetes storage operators/controllers; enable automated provisioning, self-service abstractions, multi-tenant isolation, quotas