Team leadership: Manage and grow a team of engineers, conducting performance reviews, providing coaching, and supporting career development
Technical strategy: Define and execute the technical vision and roadmap for the observability platform, ensuring it provides actionable insights into complex systems
Architectural oversight: Provide technical guidance on instrumentation, logging, metrics, and tracing to ensure comprehensive visibility across GM’s AV software stack
Incident response: Ensure the team's tools enable rapid detection, debugging, and resolution of unknown or unforeseen system failures to minimize downtime
Cross-functional collaboration: Work with other engineering teams—such as those developing AI/ML, firmware, and infrastructure—to implement observability practices and improve system reliability
Platform development: Lead the development of internal tools and data pipelines to collect, analyze, and visualize telemetry data at massive scale
Vendor management: Manage relationships and costs associated with third-party observability software and platforms
Requirements
7+ years of proven leadership experience managing software or site reliability engineering (SRE) teams
Deep understanding of core observability pillars: logs, metrics, and traces. Experience with technologies like Prometheus, Grafana, OpenTelemetry, and log management systems is crucial
Strong background in designing, developing, and architecting distributed systems, cloud-native applications, and microservices
Familiarity with Go, Python, TypeScript, or similar languages, along with software development practices, to inform code reviews and architectural decisions
Experience with modern cloud offerings like GCP, AWS, or Azure and technologies like CI/CD pipelines, Kubernetes, and Docker
Excellent interpersonal and communication skills to collaborate effectively with diverse teams and stakeholders
Nice to have
Experience working with GCP, AWS, or Azure
Familiarity with Kubernetes, Docker, Istio, Terraform, Prometheus, Grafana, TSDBs, and observability pipelines (e.g., for logging, metrics, or tracing)
Skilled in defining and instrumenting SLIs and SLOs
Ownership of, or contributions to, open source projects
Passion for self-driving technology and its potential impact on the world
What we offer
Incentive pay program
Health and wellbeing benefit programs, including medical, dental, vision, Health Savings Account, Flexible Spending Accounts, retirement savings plan, sickness and accident benefits, life insurance, paid vacation and holidays, tuition assistance programs, employee assistance program, and GM vehicle discounts