Engineering Manager Observability

United States, Austin · Employment contract · 218,800.00 - 335,300.00 USD / Year · Job Posted June 30, 2026

Job Responsibility

  • Team leadership: Manage and grow a team of engineers, conducting performance reviews, providing coaching, and supporting career development
  • Technical strategy: Define and execute the technical vision and roadmap for the observability platform, ensuring it provides actionable insights into complex systems
  • Architectural oversight: Provide technical guidance on instrumentation, logging, metrics, and tracing to ensure comprehensive visibility across GM’s AV software stack
  • Incident response: Ensure the team's tools enable rapid detection, debugging, and resolution of unknown or unforeseen system failures to minimize downtime
  • Cross-functional collaboration: Work with other engineering teams—such as those developing AI/ML, firmware, and infrastructure—to implement observability practices and improve system reliability
  • Platform development: Lead the development of internal tools and data pipelines to collect, analyze, and visualize telemetry data at a massive scale
  • Vendor management: Manage relationships and costs associated with third-party observability software and platforms

Requirements

  • 7+ years of proven leadership experience managing software or site reliability engineering (SRE) teams
  • Deep understanding of core observability pillars: logs, metrics, and traces. Experience with technologies like Prometheus, Grafana, OpenTelemetry, and log management systems is crucial
  • Strong background in designing, developing, and architecting distributed systems, cloud-native applications, and microservices
  • Familiarity with Go, Python, TypeScript, or similar, along with software development practices to inform code reviews and architectural decisions
  • Experience with modern cloud offerings like GCP, AWS, or Azure and technologies like CI/CD pipelines, Kubernetes, and Docker
  • Excellent interpersonal and communication skills to collaborate effectively with diverse teams and stakeholders

Nice to have

  • Experience working with GCP, AWS, or Azure
  • Familiarity with Kubernetes, Docker, Istio, Terraform, Prometheus, Grafana, TSDBs, and observability pipelines (e.g., for logging, metrics, or tracing)
  • Skilled in defining and instrumenting SLIs and SLOs
  • Ownership of or contributions to open source projects
  • Passion for self-driving technology and its potential impact on the world

What we offer

  • Incentive pay program
  • Health and wellbeing benefit programs including medical, dental, vision, Health Savings Account, Flexible Spending Accounts, retirement savings plan, sickness and accident benefits, life insurance, paid vacation & holidays, tuition assistance programs, employee assistance program, GM vehicle discounts
  • Company vehicle evaluation program
  • Relocation benefits


Similar Jobs for Engineering Manager Observability

8 matching positions

Engineering Manager - Observability & Reliability Engineering Obsession

We are looking for an Engineering Manager to join the OREO (Observability Reliab...
Location: France, Paris
Salary: Not provided
Company: Doctolib
Expiration Date: Until further notice
Requirements
  • 5+ years of software engineering or SRE experience, with a strong technical background in cloud-native environments (preferably AWS, GCP, and/or Kubernetes-based)
  • 3+ years of engineering management experience, leading technical teams (ideally SRE, platform, or infrastructure teams)
  • Deep understanding of observability tooling and architecture (Fluent Bit, OpenTelemetry, Loki, Elasticsearch, Prometheus, Thanos, Datadog)
  • Experience with infrastructure as code (Terraform, OpenTofu) and secrets management systems (Vault, AWS Secrets Manager)
  • Proven ability to balance technical depth with people leadership, able to mentor engineers, review technical designs, and guide architectural decisions
Job Responsibility
  • Lead, coach, and grow a team of Site Reliability Engineers, supporting their technical development and career progression
  • Create a culture of operational excellence, continuous improvement, and psychological safety within the team
  • Conduct regular 1:1s, performance reviews, and career development conversations
  • Recruit, onboard, and retain top SRE talent aligned with Doctolib's mission and values
  • Partner with SREs and senior engineers to define and evolve the observability strategy across the platform, focusing on logging, metrics, tracing, and alerting
  • Own the strategy and evolution of critical transversal services including HashiCorp Vault and Terraform Enterprise
  • Drive prioritization and roadmap planning for large-scale reliability and observability initiatives
  • Ensure alignment between team objectives and broader engineering and business goals
  • Advocate for and allocate resources toward reducing technical debt and improving developer experience
  • Own the team's on-call experience and contribute to the incident response processes, ensuring sustainable practices and continuous improvement
What we offer
  • Free comprehensive health insurance for you and your children
  • Parent Care Program: receive one additional month of leave on top of the legal parental leave
  • Free mental health and coaching services through our partner Moka.care
  • For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support
  • Work from EU countries and the UK for up to 10 days per year, thanks to our flexibility days policy
  • A subsidy from the work council to refund part of a sports club membership or creative class
  • Up to 14 days of RTT
  • Lunch voucher with Swile card
Employment type: Full-time

Engineering Manager - Observability & Reliability Engineering Obsession

We are looking for an Engineering Manager to join the OREO (Observability Reliab...
Location: Germany, Berlin
Salary: Not provided
Company: Doctolib
Expiration Date: Until further notice
Requirements
  • 5+ years of software engineering or SRE experience, with a strong technical background in cloud-native environments (preferably AWS, GCP, and/or Kubernetes-based)
  • 3+ years of engineering management experience, leading technical teams (ideally SRE, platform, or infrastructure teams)
  • Deep understanding of observability tooling and architecture (Fluent Bit, OpenTelemetry, Loki, Elasticsearch, Prometheus, Thanos, Datadog)
  • Experience with infrastructure as code (Terraform, OpenTofu) and secrets management systems (Vault, AWS Secrets Manager)
  • Proven ability to balance technical depth with people leadership, able to mentor engineers, review technical designs, and guide architectural decisions
Job Responsibility
  • Lead, coach, and grow a team of Site Reliability Engineers, supporting their technical development and career progression
  • Create a culture of operational excellence, continuous improvement, and psychological safety within the team
  • Conduct regular 1:1s, performance reviews, and career development conversations
  • Recruit, onboard, and retain top SRE talent aligned with Doctolib's mission and values
  • Partner with SREs and senior engineers to define and evolve the observability strategy across the platform, focusing on logging, metrics, tracing, and alerting
  • Own the strategy and evolution of critical transversal services including HashiCorp Vault and Terraform Enterprise
  • Drive prioritization and roadmap planning for large-scale reliability and observability initiatives
  • Ensure alignment between team objectives and broader engineering and business goals
  • Advocate for and allocate resources toward reducing technical debt and improving developer experience
  • Own the team's on-call experience and contribute to the incident response processes, ensuring sustainable practices and continuous improvement
What we offer
  • Free comprehensive health insurance for you and your children
  • Parent Care Program: receive one additional month of leave on top of the legal parental leave
  • Free mental health and coaching services through our partner Moka.care
  • For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support
  • Work from EU countries and the UK for up to 10 days per year, thanks to our flexibility days policy
  • A subsidy from the work council to refund part of a sports club membership or creative class
  • Up to 14 days of RTT
  • Lunch voucher with Swile card
Employment type: Full-time

Engineering Manager, Developer Experience & Observability

We don’t just build APIs; we build the lens through which thousands of global me...
Location: United States, San Francisco
Salary: 215,000.00 - 320,000.00 USD / Year
Company: Adyen
Expiration Date: Until further notice
Requirements
  • 5+ years of technical experience, with a significant background as a hands-on Software Engineer
  • 3+ years in a leadership role, managing high-performing engineering teams while maintaining boots on the ground technical literacy
  • Proven track record in building developer-facing tools, observability platforms, or high-traffic API products
  • Experience in payments or building external-facing developer consoles/dashboards is a plus
  • Deeply comfortable with Java, SQL, and distributed systems
  • Experience with large-scale systems
  • Work authorized in the United States without the need for new visa sponsorship
Job Responsibility
  • Maintain a strong product mindset and move beyond standard monitoring to build Customer-Centric Observability
  • Participate in architectural designs and code reviews, ensuring the data tech stack delivers near real-time insights at scale
  • Lead the team to ship functionality within weeks, impacting millions of transactions globally
  • Coach, mentor, and build a high-performing team of Backend and Frontend engineers
Employment type: Full-time

Engineering Manager, Developer Platform & Observability

As the Engineering manager for developer observability, you will lead a speciali...
Location: United States, San Francisco
Salary: 215,000.00 - 320,000.00 USD / Year
Company: Adyen
Expiration Date: Until further notice
Requirements
  • 5+ years of technical experience, with a significant background as a hands-on Software Engineer
  • 3+ years in a leadership role, managing high-performing engineering teams while maintaining 'boots on the ground' technical literacy
  • Proven track record in building developer-facing tools, observability platforms, or high-traffic API products
  • Experience in payments or building external-facing developer consoles/dashboards
  • You are deeply comfortable with Java, SQL, and distributed systems
  • You have experience with large-scale systems but prioritize the product knowledge and customer experience over technical complexity for its own sake
Job Responsibility
  • Maintain a strong product mindset and move beyond standard monitoring to build 'Customer-Centric Observability'
  • Participate in architectural designs and code reviews
  • Act as a technical leader and domain expert for your team's products
  • Lead your team to ship functionalities within weeks
  • Coach, mentor, and build a high-performing team of Backend and Frontend engineers
  • You will be responsible for continuous performance management, professional development, and maintaining a culture of rapid, top-notch execution
What we offer
  • RSUs
Employment type: Full-time

Engineering Manager, Production Engineering

We're looking for a hands-on Engineering Manager to lead our Production Engineer...
Location: United States, San Francisco
Salary: 209,000.00 - 253,000.00 USD / Year
Company: Crusoe
Expiration Date: Until further notice
Requirements
  • 5+ years of software or infrastructure engineering experience, with at least 1–2 years in an engineering management or tech lead role
  • Strong SRE or production engineering background — hands-on experience with incident management, SLO frameworks, runbooks, and on-call operations
  • Solid coding ability; comfortable writing production-grade code in Go, Python, or similar languages to build tooling and automation
  • Experience working with or embedding into cross-functional product teams, and influencing engineering decisions across organizational boundaries
  • Familiarity with container orchestration and cloud-native infrastructure — Kubernetes, distributed systems, and cloud service architectures
  • Strong communication skills — able to clearly represent technical risk and operational status to both engineering peers and business stakeholders
Job Responsibility
  • Leading and growing a team of SREs embedded within Crusoe's AI product areas, setting technical direction and fostering a culture of ownership and continuous improvement
  • Contributing as an IC — reviewing code, building tooling, and driving automation to reduce toil and improve the reliability and scalability of production services
  • Owning SLA/SLO performance, incident response, and on-call health for service offerings; leading blameless post-mortems and driving systemic remediation
  • Partnering with embedded product and platform engineering teams to influence infrastructure design, observability strategy, and operational readiness for new and existing services
  • Defining and tracking reliability, performance, and operational maturity metrics across the team; translating data into prioritized roadmap investments
  • Serving as a technical escalation point for high-severity production incidents affecting enterprise customers, and collaborating with Cloud Support and Customer Success on resolution and communication
What we offer
  • Industry competitive pay
  • Restricted Stock Units in a fast growing, well-funded technology company
  • Health insurance package options that include HDHP and PPO, vision, and dental for you and your dependents
  • Employer contributions to HSA accounts
  • Paid Parental Leave
  • Paid life insurance, short-term and long-term disability
  • Teladoc
  • 401(k) with a 100% match up to 4% of salary
  • Generous paid time off and holiday schedule
  • Cell phone reimbursement
Employment type: Full-time

Engineering Manager, Infrastructure Engineering

This is not a traditional SRE or DevOps role. Whatnot's Reliability Engineering ...
Location: Poland, Kraków
Salary: Not provided
Company: Whatnot
Expiration Date: Until further notice
Requirements
  • 10+ years of experience in infrastructure or platform engineering
  • 5+ years managing engineering teams
  • Experience leading managers or multiple teams is a plus
  • Proven track record building and operating large-scale distributed systems with strong reliability, observability, and incident response practices
  • Deep technical grounding in one or more of: SLO design, monitoring/alerting, incident tooling, traffic control mechanisms, load and chaos testing, or platform engineering
  • Experience leading teams that ship developer-facing platforms, frameworks, or internal tools
  • Strong software engineering fundamentals
  • Demonstrated ability to guide teams through complex system challenges, large-scale migrations, and longer-term reliability initiatives
  • Exceptional communication and leadership skills
  • A passion for enabling teams to build fast while building safely through well-designed tooling and proactive detection mechanisms
Job Responsibility
  • Lead and mentor a team of highly skilled software engineers, supporting their technical growth, execution, and long-term career development
  • Set technical direction and quality standards for the team while empowering senior ICs to own design and architecture decisions
  • Develop and execute the strategic roadmap for reliability engineering at Whatnot
  • Build and operationalize best practices that empower product and platform teams to design and run reliable systems
  • Own the strategic roadmap for reliability tooling, including incident response systems, SLO measurement platforms, and developer-facing reliability libraries
  • Lead the team in designing and building traffic control systems as reusable platform components
  • Lead the design and execution of load testing at scale
  • Drive continuous improvement in incident detection and mitigation
  • Collaborate with cross-functional teams to influence product and architectural decisions that improve overall reliability and customer impact
  • Partner with Infrastructure and Engineering leadership to shape reliability strategy and investment priorities across the organization
Employment type: Full-time

Senior Engineering Manager, Platform Engineering (Developer Experience)

Everlaw is seeking a Senior Engineering Manager, Platform to lead teams focused ...
Location: United States, Oakland, California
Salary: 219,000.00 - 277,000.00 USD / Year
Company: Everlaw
Expiration Date: Until further notice
Requirements
  • At least 5 years as a senior engineer building developer productivity tools and/or highly available platform services (e.g., storage, pub-sub, search, caching, observability) and/or deep experience with infrastructure/cloud technologies (e.g., Terraform, Kubernetes, Docker)
  • 3+ years of experience directly managing software engineers and/or technical leads, including hiring, coaching, performance management, and growing a high-performing team
  • 2+ years of experience building and leading developer experience or platform teams/programs that deliver internal platforms and tooling with measurable productivity outcomes (e.g., faster builds/tests, improved CI/CD lead times, higher deployment frequency)
  • Experience managing scalable database infrastructure (e.g., Postgres, MySQL or equivalent)
  • Ability to communicate at the right altitude with both technical and non-technical stakeholders, and experience leading cross-functional roadmaps with Engineering Operations, Security Engineering, DevOps, Product, and Design
  • Authorized to work in the United States. Please note that currently, Everlaw is not sponsoring employment visas.
Job Responsibility
  • Lead platform teams that build and evolve core internal platforms and developer tooling—spanning build/test infrastructure, CI/CD, and developer workflows—to improve engineer productivity and time-to-value
  • Collaborate closely with Engineering Operations, Security Engineering, DevOps, Product, and Design to synthesize requirements and prioritize impactful investments
  • Drive roadmapping, resourcing, and execution for critical platform areas that make it better and cheaper to develop, test, and release software
  • Establish and use developer efficiency metrics (e.g., build/test times, deploy lead time, change failure rate) to identify bottlenecks and plan ambitious improvements to workflows
  • Ensure operational excellence for platform services and tooling with clear SLOs, robust observability, and incident/bug management practices
  • Coach and develop engineers and leads: provide actionable feedback, elevate technical execution, and foster an inclusive, high-accountability culture
  • Partner with Engineering Operations to improve processes for alignment, goal setting, empowerment, and cross-team execution across Engineering
  • Communicate effectively with both technical and non-technical stakeholders, adjusting altitude from strategy to technical deep dives as needed.
What we offer
  • Medical
  • Dental
  • Wellness program
  • Paid parental leave
  • Professional development
  • Fully stocked kitchen
  • Equity program
  • 401(k) retirement plan with company matching
  • Health, dental, and vision
  • Flexible Spending Accounts for health and dependent care expenses
Employment type: Full-time

Engineering Manager, Product Engineering

Engineering is the backbone of Everlaw. We build features that delight our custo...
Location: United States, Oakland
Salary: 198,000.00 - 250,000.00 USD / Year
Company: Everlaw
Expiration Date: Until further notice
Requirements
  • BS/MS or PhD in Computer Science (or equivalent)
  • Sound foundational understanding of a wide range of computer science topics and concerns relating to system and software design
  • At least 5 years of experience as a senior engineer building product features and full-stack web applications
  • Good dynamic range that you apply to different situations - you can step back and empower, while also diving deep into the code to understand the details
  • Ability to communicate at the right altitude with both technical and non-technical stakeholders
  • Experience working with stakeholder teams (internal and/or external) in setting and collaborating on technical roadmaps
  • Experience explaining to customers how the platform handles reliability, security, and compliance matters
  • At least 1 year of experience leading software engineers, either as a manager managing engineers or as a technical lead managing the technical workstreams of software engineers
  • Experience managing the technical workstreams of software engineers and supporting them in execution
  • Demonstrated ability to lead an inspired, high-performing, highly motivated, and accountable team
Job Responsibility
  • Build features and functionality for the Everlaw core product
  • Work closely with Product, Design, DevOps, Security Engineering and application engineering leads to synthesize requirements and prioritize efforts
  • Lead roadmapping, resourcing and execution for critical features and capabilities
  • Support and coach engineers in their career development and growth
  • Work closely with Engineering Operations team to improve processes to help with goal setting, empowerment and execution across Everlaw Engineering efforts
  • Critically observe and understand Everlaw’s platform, tooling and processes
  • Understand current and upcoming challenges and requirements from the viewpoint of multiple stakeholders
  • Understand company goals and Product roadmaps
  • Strategize, prioritize, resource and execute against features
  • Actively coach your reports to deliver on projects and ensure they get the right types of feedback and coaching they need to succeed in their careers
What we offer
  • Equity program
  • 401(k) retirement plan with company matching
  • Health, dental, and vision
  • Flexible Spending Accounts for health and dependent care expenses
  • Paid parental leave and approximately 10 days (80 hours) per year of sick leave
  • Seventeen paid vacation days plus 11 federal holidays
  • Membership to Modern Health to help employees prioritize mental health and wellness
  • Annual allocation for Learning & Development opportunities and applicable professional membership dues
  • Company-sponsored life and disability insurance
  • Work in Uptown Oakland, just steps from the BART line and dozens of restaurants and walking distance to Lake Merritt
Employment type: Full-time