Accountable for the technical strategy, architecture, and engineering execution of resiliency and recoverability across Marriott’s global technology estate - spanning AWS, Azure, Alibaba, hybrid cloud, on-premises, and partner-hosted workloads supporting hundreds of properties worldwide
Own the architectural roadmap for engineered, continuously tested resilience across the most critical revenue-supporting platforms
Serve as the single technical leader unifying resiliency (preventative, design-time) and recoverability (operational, response-time) under a single coherent strategy
Partner with major modernization and consolidation programs to ensure new and migrating platforms are recoverable by design, with repeatable failover and verified transaction success for prioritized critical workloads
Establish and chair architectural standards, production readiness criteria, and resiliency review gates that govern how new and changed systems enter production
Break down complex technical problems and drive to the best technical decision through open communication, debate, and discussion within the team and with other subject matter experts
Research emerging technologies that could offer a competitive advantage and report the findings in terms of business opportunities
Advise on the viability of emerging technologies for the business, articulating the risks, costs, and ROI
Provide guidance on operational processes and procedures to improve service, reduce costs, and make better use of technology
Lead and develop a small team of senior engineers focused on resiliency and recoverability, while operating as a force multiplier across the broader engineering organization
Requirements
Bachelor’s degree in Computer Science, Engineering, Information Systems, or a related discipline - or equivalent professional experience and certifications
8+ years of progressive experience in systems, infrastructure, cloud, or platform engineering within a large enterprise environment, including 5+ years specifically in resiliency engineering, disaster recovery, or reliability engineering at scale
Demonstrated experience as a senior technical authority - architect, principal engineer, or technical director - for enterprise resiliency and/or disaster recovery programs and for live recovery events
Proven experience designing and validating end-to-end DR and high-availability architectures for enterprise-scale workloads across cloud (AWS, Azure, GCP, or Alibaba), hybrid, and on-premises environments
Experience aligning technical recovery designs to business recovery objectives (RTO, RPO, business criticality) and translating between business impact and technical implementation
Deep working knowledge of cloud-native resiliency patterns: multi-AZ and multi-region designs, redundancy and fault tolerance, automated failover, dynamic traffic management, and adaptive connectivity
Strong recoverability foundation: backup and restore integrity, immutable and versioned backup, ransomware recovery frameworks, isolated recovery environments, and cross-region recovery patterns
Familiarity with infrastructure-as-code and automation tooling (e.g., Terraform, Ansible, CloudFormation) applied to DR orchestration, validation, and drift detection
Experience with containerized and distributed systems, including Kubernetes, service mesh, and platform-level resiliency patterns
Demonstrated ability to influence and drive accountability across a highly matrixed organization without direct authority - across application, infrastructure, cloud, network, SRE, security, and vendor teams
Excellent written, verbal, and executive communication skills; able to translate resiliency posture, risks, and tradeoffs for technical stakeholders, executives, and auditors alike
Nice to have
Graduate degree in a technical discipline
Experience operating in a global, multi-region enterprise environment with hybrid, cloud, and on-premises platforms and a complex partner/vendor ecosystem
Direct experience standing up or maturing chaos engineering, fault injection, or game-day programs in production environments
Experience with active-active architectures and zero-failover design patterns for mission-critical revenue paths
Familiarity with advanced observability - health modeling, distributed tracing, SLI/SLO design - and tooling such as Dynatrace, Splunk, Cribl, or ThousandEyes
Experience partnering with security teams on ransomware protection, isolated recovery environments, and recovery validation
Familiarity with industry frameworks and standards for resiliency, recoverability, and operational resilience (NIST, ISO 22301, ISO 27031, BCM Institute ORMM, Veeam/McKinsey DRMM)
Relevant certifications: AWS Certified Solutions Architect – Professional, Azure Solutions Architect Expert, Google Cloud Professional Cloud Architect, CBCP, DRII, ISO 22301 Lead Implementer, or CISSP
Experience in hospitality, travel, retail, or other industries with distributed property/store technology footprints and 24x7 guest- or customer-facing transactions
Prior experience leading or contributing to a technology consolidation or modernization program of significant scale