Director SRE & Operations for E-business / Digital at PUMA in Herzogenaurach, Germany. Responsible for leading global Site Reliability Engineering and Technology Operations strategy, ensuring platform reliability and performance, and managing cloud infrastructure and operational excellence.
Job Responsibility:
Leadership: Own all aspects of performance management and professional development for the team, including recruitment, development plans, constructive feedback, appraisals, and exit processes
Foster a positive and inclusive team culture by actively engaging team members, promoting open communication, and implementing initiatives that enhance employee satisfaction and well-being
Ensure compliance with, and implementation of, legal and operational requirements for occupational health and safety within your own area of responsibility
Global Site Reliability & Operations Strategy: Define and execute a global Site Reliability Engineering (SRE) and Technology Operations strategy aligned with PUMA’s D2C growth, peak trading demands, and omnichannel ambitions
Establish reliability, availability, performance, and scalability targets across all D2C platforms (eCommerce, in-store integrations, APIs, data platforms)
Own the end-to-end operational health of consumer-facing and business-critical platforms
Platform Reliability, Resilience & Performance: Drive a reliability-first mindset across engineering, embedding SRE principles such as SLIs, SLOs, SLAs, error budgets, and resilience-by-design
Ensure platforms are engineered to handle peak events (campaigns, drops, seasonal peaks) with minimal risk and rapid recovery
Lead incident management, major incident response, root cause analysis, and post-incident reviews with a strong focus on learning and prevention
Continuously improve platform observability, monitoring, alerting, and performance management
Cloud, Infrastructure & Environment Operations: Own the operational model for cloud infrastructure, environments, and platform services (prod, non-prod, CI/CD pipelines)
Partner with Architecture and Engineering teams to ensure infrastructure and platform choices support reliability, security, and cost efficiency
Drive infrastructure automation, scalability, and standardisation across global platforms
Balance performance, availability, and cost through strong FinOps practices
Operational Excellence & Engineering Enablement: Build and lead global SRE and Operations teams, setting clear roles, ways of working, and on-call models
Enable product and engineering teams to operate reliably at scale through shared tooling, standards, and operational best practices
Shift operations left by embedding operational concerns early in design, development, and release processes
Define and own runbooks, operational playbooks, and service ownership models
Security, Risk & Compliance (in partnership): Partner closely with Security, Privacy, and Risk teams to ensure platforms meet compliance, data protection, and security requirements
Ensure operational readiness for audits, penetration testing, and regulatory requirements relevant to global digital commerce
Reduce operational risk through proactive monitoring, capacity planning, and failure scenario testing
Vendor Management & Continuous Improvement: Manage strategic partners and vendors supporting infrastructure, cloud services, and operations
Continuously assess new technologies, tools, and practices to improve reliability, speed, and operational efficiency
Define and track KPIs for platform health, availability, incident response, and operational maturity
Requirements:
10–15 years of experience in technology operations, site reliability engineering, or platform engineering within large-scale digital or eCommerce environments
Proven track record owning platform reliability, availability, and operational performance for consumer-facing systems
Strong experience with cloud infrastructure, incident management, observability, and operational readiness in high-traffic, peak-driven environments
Demonstrated ability to embed SRE practices (SLOs, SLIs, incident response, automation) across engineering teams
Experienced leader of global operations or SRE teams, comfortable working in on-call and 24/7 operational models
Calm, decisive leader with a strong focus on stability, resilience, and continuous operational improvement