Site Reliability Engineer

Tier4 Group

Location:
United States

Category:
IT - Software Development

Contract Type:
Not provided

Salary:
Not provided

Job Description:

We are seeking a highly skilled Site Reliability Engineer (SRE) with strong observability expertise, proven communication skills, and the ability to drive reliability maturity across multi-team environments. This role is ideal for someone who can blend deep technical proficiency with strategic thinking and collaborative influence.

Job Responsibilities:

Design, scale, optimize, and manage Prometheus and Grafana environments
Write advanced PromQL queries, dashboards, visualizations, and metric-based calculations
Build out and maintain Grafana instances, supporting multi-team use cases
Leverage Dynatrace with strong proficiency in metrics and analytics to deliver efficient, actionable observability solutions for engineering and operations teams (e.g., dashboards, insights, reports)
Analyze telemetry data to identify the metrics that matter (MTM), drive actionable insights, and influence engineering decisions
Apply and evolve an SRE Maturity Model to help teams mature across observability, resilience, automation, and reliability
Establish, implement, and maintain Service Level Objectives (SLOs) and error budgets across applications and services
Partner effectively with engineering, product, operations, and leadership teams, translating complex technical insights into clear, actionable communication
Identify and reduce toil through automation, tooling improvements, and process refinement
Support incident analysis, reliability reviews, and continuous improvement initiatives

Requirements:

Familiarity with SRE principles, maturity models, and reliability roadmaps
Demonstrated experience improving application reliability via data-driven decisions
Hands-on experience with Prometheus, Grafana, and PromQL
Strong understanding of Dynatrace, metric analysis, and observability practices
Excellent communication skills and ability to collaborate across diverse technical and non-technical teams
Strong analytical and problem-solving skills with a bias for action

Nice to have:

Experience with Kubernetes, cloud platforms (AWS/GCP/Azure), or CI/CD pipelines
Experience with automation
Experience with large-scale distributed systems or high-availability architectures

Additional Information:

Job Posted:
December 13, 2025
