Senior Site Reliability Engineer - GM Motorsports Job at General Motors (Austin)

Job Description

We are hiring a Senior Site Reliability Engineer (SRE) to join the GM Motorsports Software Engineering Data Platform team. This team builds and operates the next-generation data infrastructure that powers analytics, simulation, and telemetry insights across GM’s racing programs including Formula 1, NASCAR, IndyCar, and IMSA. As a foundational member of the reliability function within the Data Engineering organization, you will ensure the availability, performance, and resilience of high-throughput telemetry and analytics platforms that ingest, process, and deliver mission-critical motorsports data.

Our environment handles high-frequency streaming telemetry, simulation outputs, and engineering datasets that must be reliable, observable, and scalable. You will play a key role in designing systems where resilience, automation, and observability are built in from the start. We are looking for engineers who are uncomfortable with manual toil and are driven to build platforms where scaling, recovery, and operational insight are inherent properties of the system architecture.

Job Responsibilities

Design and implement reliability practices across the motorsports data platform, including Service Level Indicators (SLIs), Service Level Objectives (SLOs), and error budgets for streaming and analytics workloads
Ensure reliability and performance of high-throughput streaming and batch data pipelines supporting telemetry ingestion, analytics processing, and simulation workloads using technologies such as Kafka, Flink, and Databricks
Build and maintain comprehensive observability frameworks including metrics, logs, and tracing across the platform. Develop dashboards, alerts, and automated responses that detect system degradation before it impacts engineering workflows
Drive the automation of platform infrastructure using Infrastructure as Code (IaC) and platform engineering best practices to enable consistent, reproducible environments across development, testing, and production
Identify operational friction and eliminate manual processes by implementing self-healing infrastructure, automation frameworks, and developer self-service capabilities
Own the reliability of data ingestion, transformation, and storage layers, ensuring stable and performant integration across distributed data systems
Continuously evaluate platform performance and scalability, ensuring the data platform can support high-frequency telemetry ingestion, real-time analytics, and large-scale historical analysis
Provide mentorship and peer review to engineers across the platform team, promoting strong operational discipline, resilient system design, and high-quality engineering practices

Requirements

Proven experience in Site Reliability Engineering (SRE), DevOps, or Platform Engineering supporting large-scale distributed systems
Strong experience with Linux systems administration and cloud-native infrastructure
Experience operating high-throughput data platforms or streaming systems (Kafka, Flink, Spark, etc.)
Hands-on experience with Infrastructure as Code tools such as Terraform or similar frameworks
Experience implementing observability stacks (Prometheus, Grafana, OpenTelemetry, Datadog, etc.)
Strong debugging and troubleshooting skills across distributed systems
Ability to break down complex reliability challenges into clear, implementation-ready initiatives
A growth mindset and commitment to continuous learning in a fast-paced engineering environment

Nice to have

Experience supporting data engineering platforms or analytics infrastructure
Experience with Kubernetes and container orchestration platforms
Familiarity with stream processing frameworks (Apache Flink, Spark Streaming, etc.)
Experience with real-time telemetry, simulation, or high-frequency data environments
Experience implementing reliability practices across multi-cloud or hybrid cloud platforms

What we offer

This role may be eligible for relocation benefits
