We are looking for a Sr. Engineering Manager to join our Platform team. You will lead two teams (Reliability & Resilience and Developer Productivity) made up of 7 engineers, with a clear growth path to 10+. You will own strategy, execution, and people leadership across core platform and operational domains, including local developer environments, CI/CD, Kubernetes, API gateway, storage and caching, observability, secrets management, cost and capacity management, and SaaS vendor relationships. You will report to the Sr. Director of Engineering and will be located in the US (preferably NYC or SF; remote considered for exceptional candidates).
Job Responsibilities:
Lead and grow a team of highly independent engineers across the Reliability & Resilience and Developer Productivity teams; set org structure, hiring plan, and delivery goals
Own the platform roadmap and execution for improvements in development velocity, iteration speed, platform availability, and deployment safety
Build an industry-leading reliability practice: manage SLOs and error budgets, run incident response and postmortems, and prioritize resilience work across critical services
Operate and evolve core platform services including API gateway, storage and caching infrastructure, secrets management, and observability
Manage capacity and cost: forecasting, right-sizing, tuning, and spend governance tied to workload and growth plans
Own key relationships with critical SaaS vendors supporting our platform stack, including evaluation, contracts/renewals, and operational integration
Requirements:
3+ years managing engineers (managing managers is a plus)
Hands-on technical depth in Kubernetes production operations and CI/CD systems
Track record of owning key platform dependencies such as API gateways, caches, and petabyte-scale KV stores and databases
Demonstrated ownership of reliability programs: SLOs, error budgets, incident response, postmortems, and measurable reductions in downtime
Proven ability to translate business goals into technical strategy and drive cross-org alignment
8+ years building and operating large-scale distributed systems
Track record of establishing trust, psychological safety, and clear expectations; skilled at timely, candid feedback
Strong facilitator in technical conflict—you listen, synthesize, decide, and bring the team with you
Nice to have:
Experience using the Vercel platform
What we offer:
Competitive compensation package, including equity
Inclusive Healthcare Package
Learn and Grow - we provide mentorship and send you to events that help you build your network and skills
Flexible Time Off
We will provide the gear you need for your role, plus a WFH budget to outfit your space as needed