Expert Site Reliability Engineer Jobs

Looking for Expert Site Reliability Engineer jobs? You are exploring a critical, high-impact role at the intersection of software engineering and systems operations. An Expert Site Reliability Engineer (SRE) is a seasoned professional who architects, builds, and maintains highly scalable, reliable, and efficient systems. The core mission is to keep services available, performant, and able to meet user demand, blending software development practices with operational rigor to solve problems at their root.

In this senior capacity, typical responsibilities extend far beyond routine maintenance. Expert SREs provide strategic technical guidance and are often the primary technical liaison between development teams, operations, and business stakeholders. They design robust system architectures, create comprehensive documentation, and develop automation to eliminate manual toil. A key focus is designing for reliability from the ground up, which includes defining Service Level Objectives (SLOs) and implementing error budgets. They proactively identify potential system risks, devise mitigation strategies, and lead incident response and post-mortem analyses to foster a culture of continuous improvement. They also play a pivotal role in refining internal processes, establishing best practices, and mentoring other engineers.

The typical skill set for these jobs is extensive. It requires deep expertise in coding and scripting (e.g., Python, Go, Java) to automate infrastructure and incident responses. Profound knowledge of cloud platforms (AWS, GCP, Azure), containerization (Docker, Kubernetes), and infrastructure-as-code (Terraform, Ansible) is essential. Candidates must be adept at designing and monitoring complex distributed systems, using observability tools for metrics, logging, and tracing. A strong understanding of networking, security principles, and database management is also important. Beyond technical depth, an Expert SRE needs exceptional problem-solving abilities, strategic thinking, and strong communication skills to translate technical concepts for diverse audiences and lead cross-functional initiatives.

Common requirements for these senior positions generally include a bachelor's degree in computer science or a related field (or equivalent experience), coupled with 8+ years of relevant work experience in systems engineering, software development, or DevOps/SRE roles. A proven track record of designing, implementing, and supporting large-scale production environments is non-negotiable. Leadership experience, either through formal mentorship or technical leadership on major projects, is often expected.

If you are passionate about building resilient systems, driving operational excellence through code, and providing technical leadership, exploring Expert Site Reliability Engineer jobs could be your next career step.
