Ceph Infrastructure Engineer Job at Palantir Technologies (London)

Ceph Infrastructure Engineer

Join the Substrate Edge Team at Palantir, where we are responsible for mission-c...

Location

United States, New York

Salary:

135000.00 - 200000.00 USD / Year

Palantir Technologies

Expiration Date

Until further notice

Requirements

4+ years of software development experience focused on core infrastructure with an emphasis on operational excellence
2+ years of experience in system design or architecture, including reliability and scaling of new and existing systems
1+ year of being operationally responsible for production grade Ceph clusters
Bachelor’s degree in Computer Science or equivalent practical experience
Ceph & Rook Expertise: Practical, hands-on experience managing Ceph storage solutions, with a deep understanding of its architecture and operational nuances, ideally using Rook
Automation Proficiency: Strong skills in infrastructure automation tools such as Terraform and Kubernetes Operators, with coding proficiency in Go, Java, or equivalent
Systems Programming: Experience in systems programming with proficiency in Go, Rust, C/C++, or equivalent languages
Hardware and OS Knowledge: Deep familiarity with hardware configurations, operating systems, and diagnostic tools
Networking Fundamentals: Solid understanding of networking principles, with experience in CNIs or cloud networking infrastructure preferred
On-premise datacenter experience: Experience working with on-premise hardware, or sysadmin/SRE in data centers

Job Responsibility

Manage Ceph at Scale: Design, deploy, and maintain Ceph storage solutions across diverse hardware environments, ensuring high availability and performance under challenging constraints
Automate Deployments: Develop and implement automation strategies for managing multiple Ceph deployments, reducing manual intervention and enhancing operational efficiency using world-class tooling
Innovate and Contribute: Drive the adoption of novel features and tools within the Ceph and CNCF ecosystems, contributing upstream as necessary to improve the broader community
Engage with Communities: Actively participate in the Ceph developer community and the CNCF, sharing insights and collaborating on open-source projects
Infrastructure Excellence: Collaborate with the team to design and build the next generation of Palantir’s infrastructure, focusing on systems that are scalable, stable, and secure

What we offer

Employees (and their eligible dependents) can enroll in medical, dental, and vision insurance as well as voluntary life insurance
Employees are automatically covered by Palantir’s basic life, AD&D and disability insurance
Commuter benefits
Take-what-you-need paid time off, not accrual-based
2 weeks paid time off built into the end of each year (subject to team and business needs)
10 paid holidays throughout the calendar year
Supportive leave of absence program including time off for military service and medical events
Paid leave for new parents and subsidized back-up care for all parents
Fertility and family building benefits including but not limited to adoption, surrogacy, and preservation
Stipend to help with expenses that come with a new child

Fulltime

Senior System Administrator II [Ceph Engineer]

This team is part of our Platform Engineering team in India. The purpose of Plat...

Location

India, Bengaluru

Salary:

Not provided

Adyen

Expiration Date

Until further notice

Requirements

5+ years of experience in Linux systems administration with a focus on storage (e.g., Ceph)
Proven experience with enterprise storage systems (e.g., Purestorage, NetApp, EMC, Dell, IBM, Hitachi)
Strong understanding of RAID, LVM, NFS, iSCSI, multipathing, and file systems (ext4, XFS, ZFS, etc.)
Hands-on experience with enterprise storage systems and clustering technologies
Proficiency in shell scripting
Python, Puppet or Ansible knowledge
Experience with virtualization technologies such as VMware or KVM
Familiarity with cloud storage solutions like AWS S3 and EBS
Experience with monitoring tools (Nagios, Prometheus, Zabbix, etc.)

Job Responsibility

Design and deploy storage infrastructure (SAN, NAS, DAS, Object Storage) in accordance with organizational needs
Monitor and maintain storage systems to guarantee optimal performance, availability, and reliability
Perform capacity planning and forecasting for future storage needs
Develop and implement strategies for backup, replication, and disaster recovery
Troubleshoot and resolve issues related to storage systems
Maintain documentation of configurations, procedures, and standards
Collaborate with other Platform Engineering teams on system upgrades, migrations, and integrations
Automate storage provisioning and management using scripting or infrastructure-as-code tools

System Administrator II [Ceph Engineer]

As a storage engineer of our Storage team in Bengaluru, you will be responsible ...

Location

India, Bengaluru

Salary:

Not provided

Adyen

Expiration Date

Until further notice

Requirements

3–5 years of experience in Linux systems administration with a focus on storage (e.g., Ceph) (mandatory)
Proven experience with enterprise storage systems (e.g., Purestorage, NetApp, EMC, Dell, IBM, Hitachi) (mandatory)
You have a strong understanding of RAID, LVM, NFS, iSCSI, multipathing, and file systems (ext4, XFS, ZFS, etc.) (mandatory)
Hands-on experience with enterprise storage systems and clustering technologies
Proficiency in shell scripting
Python, Puppet or Ansible knowledge
Experience with virtualization technologies such as VMware or KVM
Familiarity with cloud storage solutions like AWS S3 and EBS
Experience with monitoring tools (Nagios, Prometheus, Zabbix, etc.)

Job Responsibility

Design and deploy storage infrastructure (SAN, NAS, DAS, Object Storage) in accordance with organizational needs
Monitor and maintain storage systems to guarantee optimal performance, availability, and reliability
Perform capacity planning and forecasting for future storage needs
Develop and implement strategies for backup, replication, and disaster recovery
Troubleshoot and resolve issues related to storage systems
Maintain documentation of configurations, procedures, and standards
Collaborate with other Platform Engineering teams on system upgrades, migrations, and integrations
Automate storage provisioning and management using scripting or infrastructure-as-code tools

Fulltime

Storage Engineer (Ceph)

As a storage engineer of our Storage team in Chicago, you will be responsible fo...

Location

United States, Chicago

Salary:

180000.00 - 243000.00 USD / Year

Adyen

Expiration Date

Until further notice

Requirements

3–5+ years of experience in Linux systems administration with a focus on storage (e.g., Ceph)
Proven experience with enterprise storage systems (e.g., Purestorage, NetApp, EMC, Dell, IBM, Hitachi)
You have a strong understanding of RAID, LVM, NFS, iSCSI, multipathing, and file systems (ext4, XFS, ZFS, etc.)
Hands-on experience with enterprise storage systems and clustering technologies
Proficiency in shell scripting
You have Python, Puppet or Ansible knowledge
Experience with virtualization technologies such as VMware or KVM
You have familiarity with cloud storage solutions like AWS S3 and EBS
Experience with monitoring tools (Nagios, Prometheus, Zabbix, etc.)

Job Responsibility

Design and deploy storage infrastructure (SAN, NAS, DAS, Object Storage) in accordance with organizational needs
Monitor and maintain storage systems to guarantee optimal performance, availability, and reliability
Perform capacity planning and forecasting for future storage needs
Develop and implement strategies for backup, replication, and disaster recovery
Troubleshoot and resolve issues related to storage systems
Maintain documentation of configurations, procedures, and standards
Collaborate with other Platform Engineering teams on system upgrades, migrations, and integrations
Automate storage provisioning and management using scripting or infrastructure-as-code tools

Fulltime

Senior Infrastructure Engineer – Hosting

As a Senior Infrastructure Engineer – Hosting you will be responsible for the de...

Location

United States

Salary:

150000.00 USD / Year

Corporate Tools

Expiration Date

Until further notice

Requirements

3-5 years of experience in Linux system administration, virtualization, and cloud infrastructure
Experience with Proxmox or other hypervisors (VMware, KVM, Xen, Hyper-V)
Experience with Ceph or SAN storage solutions for virtualization
Ability to manage kernel tuning, system performance, and process optimization
Hands-on experience with Ceph storage, ZFS, iSCSI, NFS, RAID, and SAN architectures
Understanding of storage performance metrics (IOPS, throughput, latency)
Ability to work on projects solo or with a team
Love for learning and improving code
Strong communication and collaboration skills
Experience with WordPress hosting, database replication, and caching techniques

Job Responsibility

Develop and design robust and scalable hardware solutions
Take ownership of projects from conception to deployment, ensuring timely delivery and meeting the specified requirements
Work closely with cross-functional teams, including IT, product management, and other software teams, to ensure seamless integration and alignment with business objectives
Deploy, configure, and maintain Proxmox VE clusters for virtualization or other hypervisors
Implement high-availability (HA) and failover solutions for virtual machines
Manage resource allocation (CPU, memory, disk, network) to optimize performance for hosted applications
Automate VM deployment and configuration using Ansible, Terraform, or SaltStack
Maintain backups and disaster recovery plans for virtualized environments
Design and manage Ceph clusters or SAN storage (iSCSI, NFS, ZFS, etc.) for high-performance, redundant storage
Monitor and optimize storage performance, including IOPS, latency, and throughput

What we offer

100% employer-paid medical, dental and vision for employees
Annual review with raise option
22 days Paid Time Off accrued annually, and 4 holidays
After 3 years, PTO increases to 29 days. Employees transition to flexible time off after 5 years with the company—not accrued, not capped, take time off when you want
The 4 holidays are: New Year’s Day, Fourth of July, Thanksgiving, and Christmas Day
Paid Parental Leave
Up to 6% company matching 401(k) with no vesting period
Quarterly allowance
Use it to make your remote work setup more comfortable, for continuing education classes, a plant for your desk, coffee for your coworker, a massage for yourself... really, whatever
Open concept office with friendly coworkers

Fulltime

Infrastructure Engineer

We are looking for an experienced Infrastructure Engineer to join a multi-discip...

Location

United Kingdom, Cheltenham; London; Ipswich

Salary:

Not provided

Plusnet

Expiration Date

Until further notice

Requirements

Infrastructure Architecture/Design
Infrastructure Configuration
Troubleshooting
Programming/Scripting
Performance Monitoring
Experience in: Operating Systems: Linux
Networking (Design, Configuration, Testing, Security): Dell Networking, Cisco ASA, Juniper SRX
Storage solutions: Ceph, iSCSI, Dell PowerVault
Virtualisation: VMware, Proxmox
Host provisioning: Foreman, PXE boot

Job Responsibility

Support the maintenance of existing IT infrastructure
System administration tasks
Automating repeated manual tasks
Troubleshooting issues
System health monitoring
Developing monitoring solutions to identify system inefficiencies
Patching vulnerabilities
Be hands-on in the transformation/modernisation of existing infrastructure, or the creation of new infrastructure
Evaluating new technologies against requirements
Developing proof of concepts

What we offer

Competitive salary
BT Pension scheme, minimum 5% Employee contribution, BT contribution 10%
On-call allowance (Depending on job role requirements)
25 days annual leave (not including bank holidays), increasing with service
Huge range of flexible benefits including cycle to work, healthcare, season ticket loan
World-class training and development opportunities
From January 2025, equal family leave: receive 18 weeks at full pay, 8 weeks at half pay and 26 weeks at the statutory rate
Enhanced women’s health support: including help with menopause symptoms, cancer screenings, period care and more
24/7 private virtual GP appointments for UK colleagues
2 weeks paid carer’s leave

Fulltime

Senior Infrastructure Engineer

As a Senior Infrastructure Engineer in the IT department, you provide the critic...

Location

Finland, Helsinki

Salary:

Not provided

ICEYE

Expiration Date

Until further notice

Requirements

Core Technical Profile: The T-Shaped Engineer
Virtualization & Compute: Experience with VMware or Proxmox is essential
Familiarity with shared storage backends (SAN, iSCSI, Ceph) and disaster recovery planning
Hybrid Cloud (AWS): Understand VPC architecture (Transit Gateways, Peering), IAM security boundaries, and hybrid connectivity (VPN/Direct Connect)
Advanced Networking: Comfortable configuring L2/L3 switching, debugging routing issues, and managing firewall rulesets and VPN configurations
Experience with Palo Alto and Cisco
Infrastructure as Code (IaC): Experience with Terraform (state management, modules) and Ansible (playbook optimization)
OS Administration: Deep internal knowledge of Windows Server or Linux environments
Capabilities in performance tuning, kernel diagnostics, Active Directory integration, and automated patching strategies
Hybrid Requirement: A hybrid presence is required

Job Responsibility

Own, build, and evolve our core infrastructure to enable scalable global satellite operations
Own and modernize our corporate infrastructure
Managing the physical and virtual backbone
Building resilient on-premise clusters
Optimizing hybrid networks
Transitioning our systems administration into a disciplined, code-driven practice
Infrastructure Ownership: Assume responsibility for core segments, including on-prem clusters and network segments, identifying technical debt and executing a plan to address it
Build & Stabilize: Architect and deploy new on-premise clusters while simultaneously stabilizing legacy systems
Cloud Expansion: Lead the initiative to build new infrastructure on public cloud services, ensuring seamless integration with our existing on-premise footprint
Modernization: Drive the transition from manual administrative tasks to automated, reproducible workflows using Terraform and Ansible

What we offer

Occupational healthcare, plus occupational and accident insurance
A yearly benefit budget to spend as you wish (e.g. on sport, transport, bike benefit, wellness, lunch, etc.)
Phone subscription with iPhone of choice
Relocation support (e.g. flight tickets, accommodation, relocation agency support)
Time for self-development, research, training, conferences, or certification schemes
Inspiring, collaborative offices and silent workspaces that enable you to focus

Staff Engineer, Distributed Storage, HPC & AI Infrastructure

In this role, you will design and deliver multi-petabyte storage systems purpose...

Location

Netherlands, Amsterdam

Salary:

Not provided

Together AI

Expiration Date

Until further notice

Requirements

8+ years in storage engineering with 3+ years managing distributed storage at multi-petabyte scale
Proven track record deploying and operating high-performance storage for GPU/HPC clusters
Deep Kubernetes and cloud-native storage experience in production environments
Strong coding skills in Go and Python with demonstrated ability to build production-grade tools
BS/MS in Computer Science, Engineering, or equivalent practical experience
History of technical leadership: designing systems that significantly improved performance (>3x), reliability (99.9%+ uptime), or cost efficiency
Distributed Storage Systems: Deep expertise in WekaFS, Lustre, GPFS, BeeGFS, or similar parallel filesystems at multi-petabyte scale
Object Storage: Production experience with S3, MinIO, Ceph, or R2 including performance optimization and cost management
Kubernetes Storage: CSI drivers, StatefulSets, PersistentVolumes, storage operators, and custom controllers
Storage optimization for GPU workloads, RDMA/InfiniBand networking, parallel filesystem optimization (100+ GB/s aggregate cluster throughput)

Job Responsibility

Design multi-petabyte AI/ML storage systems; integrate WekaFS, Ceph, etc.; lead capacity planning and cost optimization (30-50% savings via tiering, lifecycle policies, right-sizing)
Design/optimize RDMA, InfiniBand, 400GbE networks; tune for max throughput/min latency; implement NVMe-oF/iSCSI; troubleshoot bottlenecks; optimize TCP/IP for storage
Build Kubernetes storage operators/controllers; enable automated provisioning, self-service abstractions, multi-tenant isolation, quotas
