Delivery Production Engineering is a software engineering team, not a traditional operations or sysadmin team. We solve reliability problems by writing code: orchestration systems, load testing frameworks, automation tooling, and shared libraries. As a Production Engineer, you will blend multiple domains of software engineering to ensure Uber's services run reliably at massive scale, improving compute efficiency and accelerating developer productivity for a platform that serves millions of users around the world.
Job Responsibilities:
Design, build, and maintain software to increase the reliability, scalability, and efficiency of thousands of stateless and stateful production services spread across multiple datacenter zones and regions
Lead initiatives end-to-end within the team, the Production Engineering org, and across engineering at large to increase reliability through automation, standards, developer tooling, and reusable frameworks
Work with other engineers to deeply understand their services and guide them towards practical and reliable architecture and implementation
Apply SRE concepts such as observability, integration/load/chaos testing, on-call, incident management, failovers, and disaster recovery to improve mean time between failures (MTBF), time to detection (TTD), and time to mitigation (TTM) of incidents
Participate in on-call rotations, responding to and leading mitigation of production incidents, and driving post-incident improvements
Requirements:
8+ years of experience in Go, Java, Python, or a similar language
Experience in delivering solutions end-to-end from defining problems to generating architecture plans, implementation, testing, and delivery
Writes clear technical proposals and RFCs; able to drive engineering alignment across teams through written design docs and verbal discussion
Nice to have:
Experience in various parts of SRE / reliability engineering / incident management at a large-scale company
Experience in platform/infrastructure engineering in related disciplines such as compute platform, software networking, online storage, developer platform, and observability
Experience mentoring and leading teams on projects while remaining hands-on and technical
What we offer:
Eligible to participate in Uber's bonus program
May be offered an equity award and other types of compensation