Senior Program Manager, Technology Resilience & Operations
Leads and integrates multiple enterprise-wide programs across technology resilience, cloud governance and architectural guardrails, third-party resilience risk, and the Technology Operations Center. Drives strategy, execution, governance, and metrics to ensure the organization operates safely, reliably, and resiliently across all technology environments.
Job Responsibilities:
Lead the enterprise technology resilience program, including strategy, roadmap, execution cadence, and governance
Develop and maintain technology resilience testing frameworks aligned with regulatory, industry, and internal standards
Coordinate with engineering, infrastructure, and application teams to plan and execute resilience, failover, and chaos testing exercises
Establish centralized program oversight for critical asset mapping, scenario design, testing schedules, issue tracking, and remediation management
Define, track, and report resilience metrics, dashboards, test coverage, and issue aging to senior leadership and governance forums
Drive continuous improvement initiatives across disaster recovery, high availability, and fault-tolerant design practices
Lead cloud governance and resilience guardrail initiatives in partnership with enterprise architecture, cloud engineering, and risk teams
Define minimum resilience design requirements for cloud-native and hybrid solutions, including multi-availability-zone patterns, automated failover, observability, and dependency management
Program manage the integration of resilience controls into reference architectures, delivery pipelines, and automated policy enforcement
Develop and maintain standards, playbooks, and guidance to support consistent and resilient cloud adoption
Establish and mature a technology-led third-party resilience risk program
Program manage resilience assessments for critical third-party and fourth-party providers, including control reviews, dependency mapping, and scenario-based testing
Coordinate development and maintenance of contingency and fallback plans for critical provider failures
Partner with Legal to support contract language alignment related to resilience, testing, notification, and performance obligations
Work closely with third-party risk management, operational risk, procurement, legal, and business continuity teams to ensure integrated oversight and reporting
Define standardized resilience requirements and evidence expectations for third-party engagements
Lead the program design, build-out, and ongoing evolution of the Technology Operations Center as an enterprise monitoring and command capability
Define program requirements for operational dashboards, early warning indicators, system health telemetry, and resilience key performance indicators
Coordinate with monitoring and engineering teams to integrate analytics, runbooks, alerting standards, pattern detection, and escalation workflows
Ensure the Technology Operations Center operates as a fully enabled 24/7 capability in partnership with enterprise incident management teams
Requirements:
Ten or more years of experience in technology program management, operational resilience, technology risk, cloud engineering, or enterprise technology leadership
Demonstrated experience leading complex, cross-functional enterprise programs with regulatory and operational impact
Strong knowledge of technology resilience testing, cloud architecture principles, and observability practices
Experience working with third-party risk frameworks, regulatory expectations, and contract control requirements
Prior experience supporting or managing mission-critical operational centers such as NOC, TOC, or SOC
Proven ability to influence and drive execution across matrixed organizations without direct authority
Strong communication, stakeholder management, and executive reporting skills
Nice to have:
Experience leading programs supporting AWS, Azure, or Google Cloud resilience architectures
Familiarity with financial services regulatory guidance related to operational resilience and third-party risk
Experience with incident management, chaos engineering, fault injection, or site reliability engineering programs
Professional certifications such as AWS Solutions Architect, CRISC, CISSP, CISA, or equivalent