
Site Reliability Engineering (SRE) Jobs (On-site work)

3 Job Offers

Site Reliability Engineering (SRE)
Join Fyld as a Site Reliability Engineer (SRE) in Lisbon or Porto. You will apply your expertise in Linux/Windows, Python/Go, and IaC tools like Terraform. This role requires a strong DevOps background and knowledge of networking and monitoring systems. We offer a performance-driven culture focus...
Location
Portugal, Lisboa; Porto
Salary
Not provided
https://www.fyld.pt
Fyld
Expiration Date
Until further notice
Site Reliability Engineer SRE – ML platform
Join our team in Sunnyvale as a Site Reliability Engineer for our ML platform. You will design AWS cloud solutions and build MLOps pipelines using Kubernetes, Docker, and Python. This role involves deploying ML models and collaborating with data scientists, requiring strong expertise in Kubeflow,...
Location
United States, Sunnyvale
Salary
Not provided
thirdeyedata.ai
Thirdeye Data
Expiration Date
Until further notice
Executive Principal, Site Reliability Engineering (SRE) – DevOps
Lead our Site Reliability Engineering (SRE) and DevOps strategy as an Executive Principal in Irvine. You will guide multiple infrastructure teams, ensuring 24x7 operational excellence in a complex hybrid environment. This senior role requires deep expertise in automation, CI/CD, and platform reli...
Location
United States, Irvine
Salary
180,000 - 210,000 USD / Year
haeaus.com
Hyundai AutoEver America
Expiration Date
Until further notice
Explore the dynamic and critical field of Site Reliability Engineering (SRE) jobs, where software engineering meets operations to build scalable, reliable, and efficient systems. SRE is a discipline that applies a software engineering mindset to infrastructure and operations problems. Professionals in this role, known as Site Reliability Engineers, are the bridge between development and IT operations, ensuring that services are highly available, performant, and resilient. Their core mission is to systematically eliminate manual operational work through automation while maintaining a focus on the end-user experience and system health.

The typical responsibilities of an SRE are multifaceted. A primary duty is ensuring service reliability and availability, often measured against Service Level Objectives (SLOs), and managing error budgets. This involves designing, building, and maintaining monitoring, alerting, and observability platforms to gain deep insights into system behavior. SREs proactively work on capacity planning, performance tuning, and disaster recovery strategies. A significant portion of their work is dedicated to automation: creating software to automate repetitive tasks, manage infrastructure as code, and streamline deployment pipelines. When incidents occur, SREs lead the response, conducting thorough post-mortems and root cause analysis to implement permanent fixes and prevent future outages. They also collaborate closely with development teams to advocate for reliability best practices from the initial design phase, often by developing tools and frameworks that improve the entire software development lifecycle.

To succeed in SRE jobs, a specific blend of skills is required. A strong software engineering background is fundamental, with proficiency in programming languages like Python, Go, or Java, and scripting in Bash or PowerShell. Deep knowledge of modern infrastructure is essential, including expertise in cloud platforms (AWS, GCP, Azure), containerization with Docker, and orchestration with Kubernetes. Experience with Infrastructure as Code tools like Terraform, Ansible, or Puppet is standard. SREs must have a solid grasp of networking, operating systems (Linux/Unix), and database management. Equally important are the analytical and problem-solving skills to diagnose complex distributed systems issues. Familiarity with the full CI/CD pipeline and a commitment to a DevOps culture of collaboration and shared responsibility are crucial. Soft skills such as effective communication, a proactive mindset, and a focus on blameless post-mortems are highly valued in this collaborative, high-stakes field.

For those passionate about building robust systems, solving intricate puzzles, and writing code to automate infrastructure, Site Reliability Engineering offers a challenging and rewarding career path. SRE jobs are at the heart of modern digital enterprises, making them crucial roles for anyone looking to impact product stability and user satisfaction directly. Discover your next opportunity in this essential tech discipline.
