
Reliability Engineer I Jobs (Remote work)

87 Job Offers

Principal Site Reliability Engineer
Lead the evolution to AI-driven resilience as a Principal SRE at Groupon. Architect self-healing systems on GCP/AWS with Kubernetes and Terraform, leveraging AIOps for predictive reliability. This role in Colombia offers a chance to shape global platform stability with cutting-edge tech and signi...
Location: Colombia
Salary: Not provided
Company: Groupon (groupon.com)
Expiration Date: Until further notice
Principal Site Reliability Engineer
Lead the evolution to AI-driven resilience as a Principal SRE at Groupon. Architect self-healing systems on GCP/AWS with Kubernetes and Terraform, leveraging AIOps for predictive operations. This role in Ecuador offers a chance to shape global platform reliability with cutting-edge tech and signi...
Location: Ecuador
Salary: Not provided
Company: Groupon (groupon.com)
Expiration Date: Until further notice
Manager, Site Reliability Engineering and Incident Management
Lead our Site Reliability Engineering and Incident Management team in Atlanta. You will drive platform resilience, oversee critical incident response, and mentor a skilled team. This role requires deep cloud expertise and a passion for building reliable, scalable systems in a fast-paced SaaS envi...
Location: United States, Atlanta
Salary: 118000.00 - 160000.00 USD / Year
Company: Planet DDS (planetdds.com)
Expiration Date: Until further notice
Database Reliability Engineer
Seeking a Database Reliability Engineer to manage and automate our mission-critical MySQL infrastructure in the cloud. You will apply your expertise in MySQL, cloud platforms (Azure preferred), and coding (Python, PowerShell) to ensure performance and reliability at scale. This role involves clos...
Location: United States
Salary: 120000.00 - 179000.00 USD / Year
Company: PointClickCare (pointclickcare.com)
Expiration Date: Until further notice
Senior Site Reliability Engineer
Join Prolific as a Senior Site Reliability Engineer to ensure platform resilience and performance. You'll leverage your GCP, Kubernetes, and Terraform expertise to build scalable infrastructure in the UK. Champion SRE principles, enhance observability, and enjoy a remote role with competitive ben...
Location: United Kingdom
Salary: Not provided
Company: Prolific (prolific.com)
Expiration Date: Until further notice
Senior Site Reliability Engineer
Join our team as a Senior Site Reliability Engineer, focusing on our self-hosted product platform. You will architect and maintain containerized systems (Kubernetes, Docker) and ensure seamless customer deployments. This remote US role offers competitive salary, equity, and comprehensive benefits...
Location: United States
Salary: 200000.00 - 220000.00 USD / Year
Company: Tines (tines.com)
Expiration Date: Until further notice
Site Reliability Engineering Manager
Lead a globally distributed SRE team at the Wikimedia Foundation, supporting infrastructure used by hundreds of millions. Utilize your hands-on expertise in cloud, Linux, Kubernetes, and IaC to guide critical projects and ensure reliability. This remote US role offers the chance to mentor enginee...
Location: United States of America
Salary: 132439.00 - 208378.00 USD / Year
Company: Wikimedia Foundation (wikimediafoundation.org)
Expiration Date: Until further notice
Staff Site Reliability Engineer
Join Affirm in Spain as a Staff Site Reliability Engineer. You will define technical strategy and frameworks to ensure system reliability at scale using AWS, Kubernetes, and Python/Kotlin. This senior role requires 8+ years of backend and SRE experience, focusing on incident management and distri...
Location: Spain
Salary: 101000.00 - 131000.00 EUR / Year
Company: Affirm (affirm.com)
Expiration Date: Until further notice
Staff Site Reliability Engineer
Lead our Site Reliability Engineering vision in Poland as a Staff SRE. You will design scalable backend systems using AWS, Kubernetes, and Python/Kotlin, while driving incident management and system resilience. This role offers major benefits like full health premium coverage and flexible lifesty...
Location: Poland
Salary: 358000.00 - 458000.00 PLN / Year
Company: Affirm (affirm.com)
Expiration Date: Until further notice
Senior Site Reliability Engineer
Join Affirm in Poland as a Senior Site Reliability Engineer. You will design and operate highly available distributed systems using AWS, Kubernetes, and Python/Kotlin. Drive reliability frameworks, lead incident management, and support a global engineering team. Enjoy premium benefits, including ...
Location: Poland
Salary: 301000.00 - 401000.00 PLN / Year
Company: Affirm (affirm.com)
Expiration Date: Until further notice
Senior Site Reliability Engineer
Join Affirm in Spain as a Senior Site Reliability Engineer. Design and launch scalable backend systems using Python, Kotlin, AWS, and Kubernetes. Drive reliability, incident management, and tooling for honest financial products. Enjoy comprehensive benefits, including full health coverage and fle...
Location: Spain
Salary: 85000.00 - 115000.00 EUR / Year
Company: Affirm (affirm.com)
Expiration Date: Until further notice
Customer Reliability Engineer
Join Endor Labs as a Customer Reliability Engineer, the top-tier technical expert on our Customer Success team. You'll resolve complex, high-priority escalations using deep software engineering and DevOps expertise. This US-based role offers competitive benefits, flexible PTO, and a collaborative...
Location: United States
Salary: Not provided
Company: Endor Labs (https://www.endorlabs.com)
Expiration Date: Until further notice
Principal Site Reliability Engineer
Lead the CVML Platform team as a Principal SRE, architecting a secure, cost-effective hybrid infrastructure for robotics. Integrate edge devices, on-prem, and cloud (AWS, K8s) using Terraform, Python, and Go. Optimize performance and stability while collaborating cross-functionally in the autonom...
Location: United States
Salary: 166000.00 - 293000.00 USD / Year
Company: Blue River Technology (bluerivertechnology.com)
Expiration Date: Until further notice
Senior Site Reliability Engineer
Join our team as a Senior Site Reliability Engineer in the United States. You will enhance system reliability through automation, CI/CD, and Azure cloud expertise. This role requires deep experience in scalable, distributed systems and observability practices. Drive incident resolution and influe...
Location: United States of America
Salary: Not provided
Company: VantageLinks (vantagelinks.com)
Expiration Date: Until further notice
Staff Site Reliability Engineer
Lead our infrastructure reliability strategy as a Staff Site Reliability Engineer. Architect large-scale, fault-tolerant AWS systems using Terraform and ECS expertise. Drive technical initiatives, mentor engineers, and tackle complex operational challenges. This remote US role offers a discretion...
Location: United States
Salary: 151040.00 - 188800.00 USD / Year
Company: Bugcrowd (bugcrowd.com)
Expiration Date: Until further notice
Senior Site Reliability Engineer
Join our agile infrastructure team as a Senior Site Reliability Engineer. Design and maintain scalable AWS infrastructure using Terraform and ECS. You'll automate CI/CD, ensure system reliability, and collaborate in an international tech environment. This remote US role offers a bonus program for...
Location: United States
Salary: 129280.00 - 161600.00 USD / Year
Company: Bugcrowd (bugcrowd.com)
Expiration Date: Until further notice
Senior Site Reliability Engineer - Data Pipeline
Join our Data Pipeline team as a Senior Site Reliability Engineer in Slovakia. You will build and maintain a robust GCP/Kubernetes ecosystem, ensuring high observability and scalability. We seek an expert in Terraform, HELM, and DevOps culture who values infrastructure as code. Enjoy a virtual-fi...
Location: Slovakia
Salary: 3500.00 EUR / Month
Company: Bloomreach (bloomreach.com)
Expiration Date: Until further notice
Site Reliability Engineer
Join Luma AI to architect the physical and digital foundation of AGI. As a Site Reliability Engineer, you will build and optimize massive-scale, multi-vendor GPU supercomputers in Palo Alto or London. Your elite HPC knowledge will design high-performance clusters, optimizing low-level networking ...
Location: United States; United Kingdom, Palo Alto; London
Salary: 170000.00 - 360000.00 USD / Year
Company: Luma AI (lumalabs.ai)
Expiration Date: Until further notice
Software Engineer - Reliability
Join Luma as a Software Engineer - Reliability in Palo Alto. Architect and scale next-gen AI infrastructure across AWS and OCI. Utilize your deep Linux and system performance expertise to ensure high availability for GPU clusters. Thrive in a fast-paced role solving complex hardware/software chal...
Location: United States, Palo Alto
Salary: 170000.00 - 360000.00 USD / Year
Company: Luma AI (lumalabs.ai)
Expiration Date: Until further notice
Software Engineer - Reliability GPU Infrastructure
Shape the future of creative AI as a Software Engineer for GPU Infrastructure at Luma AI. You will architect and own our massive-scale, multi-cloud and on-premise compute substrate. This role requires deep expertise in distributed systems and infrastructure as code, based in Palo Alto or London.
Location: United States; United Kingdom, Palo Alto; London
Salary: 170000.00 - 360000.00 USD / Year
Company: Luma AI (lumalabs.ai)
Expiration Date: Until further notice

About the Reliability Engineer I role

Explore Reliability Engineer I jobs and launch a career dedicated to ensuring system integrity and operational excellence. A Reliability Engineer I is an entry-level professional focused on proactively preventing failures, optimizing performance, and maximizing the uptime of critical systems. This foundational role exists across diverse industries, from technology and software to manufacturing, energy, and industrial operations. The specific systems vary, from software applications and cloud infrastructure to physical machinery such as rotating equipment, but the core mission is the same: to build and maintain resilient, efficient, and dependable operations through engineering principles.

Individuals in these roles typically engage in a blend of monitoring, analysis, automation, and collaboration. Common responsibilities include assisting in the design and implementation of monitoring and alerting systems to gain visibility into system health. They analyze performance data, incident reports, and maintenance records to identify patterns and potential points of failure. A significant part of the role involves contributing to automation efforts, writing scripts to manage infrastructure, deploy applications, or streamline repetitive operational tasks, thereby reducing manual toil and human error. Reliability Engineers also participate in incident response, helping to diagnose and resolve issues, and contribute to post-incident reviews to document root causes and implement preventive measures. They often work closely with development and operations teams to advocate for reliability standards, such as scalable architecture and robust fault tolerance, throughout a system's lifecycle.

Typical skills and requirements for Reliability Engineer I positions include a strong foundational understanding of engineering concepts, often supported by a bachelor's degree in computer science, engineering, or a related technical field. Key technical proficiencies often include a scripting or programming language (such as Python or Bash), familiarity with operating systems, and an introductory knowledge of relevant domain tools. For software-centric roles, this might mean basic knowledge of cloud platforms (AWS, Azure, GCP), containerization (Docker, Kubernetes), and CI/CD pipelines. For industrial roles, understanding mechanical systems, statistical analysis, and reliability methodologies (like Root Cause Analysis or Failure Mode and Effects Analysis) is crucial. Regardless of the domain, successful candidates demonstrate a problem-solving mindset, a passion for automation, keen analytical abilities, and effective communication skills to collaborate across teams.

Pursuing Reliability Engineer I jobs is ideal for those who enjoy the intersection of development and operations, have meticulous attention to detail, and derive satisfaction from building systems that users and businesses can depend on every day. It is a career path built on continuous learning and offers a clear trajectory for growth into more senior engineering and specialist positions.
