Engineering Manager, Infrastructure Engineering Job at Whatnot (Kraków)

Job Description

This is not a traditional SRE or DevOps role. Whatnot's Reliability Engineering team is a software engineering team that builds the distributed systems, frameworks, and developer-facing tools that make reliability a built-in property of the platform. Think of it as platform engineering for reliability: the team designs and ships software that other engineers use every day to build, test, and operate their services with confidence.

As a senior leader in our Infrastructure organization, you will own the technical direction and execution of the Reliability Engineering team. The team's mandate spans SLOs, observability, load testing, resilience testing, incident response, and traffic control mechanisms. You will partner closely with engineering teams across the company to define reliability standards, accelerate detection and mitigation of issues, and ensure Whatnot's systems remain reliable, scalable, and performant as we grow.

This role carries significant leadership scope. Depending on the candidate, this person may also take on responsibilities as the Infrastructure engineering lead for Poland and a broader site leadership role for Whatnot Poland, either immediately or as the Poland presence scales.

Job Responsibilities

Lead and mentor a team of highly skilled software engineers, supporting their technical growth, execution, and long-term career development
Set technical direction and quality standards for the team while empowering senior ICs to own design and architecture decisions
Develop and execute the strategic roadmap for reliability engineering at Whatnot
Build and operationalize best practices that empower product and platform teams to design and run reliable systems
Own the strategic roadmap for reliability tooling, including incident response systems, SLO measurement platforms, and developer-facing reliability libraries
Lead the team in designing and building traffic control systems as reusable platform components
Lead the design and execution of load testing at scale
Drive continuous improvement in incident detection and mitigation
Collaborate with cross-functional teams to influence product and architectural decisions that improve overall reliability and customer impact
Partner with Infrastructure and Engineering leadership to shape reliability strategy and investment priorities across the organization
Build a culture of learning and continuous improvement through blameless incident analysis, proactive reliability investment, and systematic reduction of repeated failure patterns
Scale the team through hiring, mentorship, leadership development, and thoughtful organizational design

Requirements

10+ years of experience in infrastructure or platform engineering
5+ years managing engineering teams
Experience leading managers or multiple teams is a plus
Proven track record building and operating large-scale distributed systems with strong reliability, observability, and incident response practices
Deep technical grounding in one or more of: SLO design, monitoring/alerting, incident tooling, traffic control mechanisms, load and chaos testing, or platform engineering
Experience leading teams that ship developer-facing platforms, frameworks, or internal tools
Strong software engineering fundamentals
Demonstrated ability to guide teams through complex system challenges, large-scale migrations, and longer-term reliability initiatives
Exceptional communication and leadership skills
A passion for enabling teams to build fast while building safely through well-designed tooling and proactive detection mechanisms

Nice to have

Experience leading multiple teams, managing managers, or serving as a site lead is a plus
