Software Engineer, Reliability Platforms Job at DoorDash (San Francisco)

Job Description

Reliability Platform is a key pillar of DoorDash’s Production Lifecycle team, alongside Observability and Deploy Platform. This group’s mandate is to enable users and agents to reason about the health of our services, facilitate change control safety, and provide the means to rapidly address any unexpected state.

Ownership is fundamental in DoorDash culture, and all teams own what they build. We are not here to operate services on others’ behalf, but to provide tools that enable their success and ensure a consistently high level of quality for everything we do. We approach challenges with the pragmatic perspective of an SRE, and deliver solutions with the mindset of a SWE who detests toil and repetitive tasks. We use software and agents to “keep the lights on” and focus our energy on innovation that will level up the entire organization.

This mission falls into three main categories:

Service Health: Provide SLO frameworks, analytics tools, and AI Agent enablement to extract high-quality insights from our telemetry that pinpoint faults or highlight deficiencies
Change Orchestration: Provide self-service provisioning orchestration, evolving from UI-driven to agent-driven, so that our developers can safely affect production from their IDE
Incident Management: Define and deliver the tools, processes, and policies our peers use to quickly understand and recover from any unexpected issues in the environment

As a Software Engineer on the Reliability Platform team, you’ll help design, build, and operate services and infrastructure that deliver on the team’s broad mandate described above.

Job Responsibilities

Design, build, and operate services and infrastructure that deliver on the team’s broad mandate
Deliver innovative capabilities
Build great infrastructure
Balance practical and possible
Be customer obsessed
Automate everything
Shape the future of operations

Requirements

5+ years of experience in an infrastructure, platform, or backend engineering role
Platform Engineering Mindset: You think in terms of APIs, abstractions, and workflows
Backend Development Skills: Fluent in Go (or a similar language)
Cloud/Infra Expertise: Comfortable with AWS primitives, security best practices, containerization, and Infrastructure as Code tools like Terraform or Pulumi
SRE Experience: Understands concepts like SLOs, error budgets, and incident response
Flexibility
AI Alignment: You embrace the use of AI tools to be a more productive and capable engineer
Curiosity About the Future: You’re excited about automation and agentic, AI-assisted operations

What we offer

401(k) plan with employer matching
16 weeks of paid parental leave
Wellness benefits
Commuter benefits match
Flexible paid time off/vacation
80 hours of paid sick time per year
Medical, dental, and vision benefits
11 paid holidays
Disability and basic life insurance
Family-forming assistance
Mental health program
