Cloud Resilience & Disaster Recovery Engineer Job at Randstad (Melbourne)

Lead Rubrik Backup Engineer IV

The Rubrik Backup Engineer IV is a senior technical specialist responsible for t...

Location

India

Salary:

Not provided

Rackspace

Expiration Date

Until further notice

Requirements

Minimum of 18-20 years of experience in IT infrastructure with at least 4 years of hands-on experience in Rubrik
Proven track record of managing complex enterprise-scale backup environments
Experience with backup and recovery for databases (MSSQL, Oracle), file servers, and virtual machines
Bachelor's degree in Computer Science, Information Technology, or equivalent work experience
Expert knowledge of Rubrik CDM architecture, RBS, Polaris, and Rubrik APIs
Advanced skills in backup for virtualized environments (VMware, Hyper-V)
Strong understanding of file-level, database-level, and VM-level backup and restore operations
Deep knowledge of cloud-native backups and cloud archiving using AWS S3, Azure Blob, and GCP storage
Hands-on experience with integration and automation (e.g., Python, PowerShell, REST API, Terraform, Ansible)
Proficiency in disaster recovery design, planning, and orchestration (DR runbooks)

Job Responsibility

Serve as the highest level of technical escalation for Rubrik-related incidents and issues
Architect and implement Rubrik backup solutions across hybrid, on-premises, and multi-cloud environments (AWS, Azure, GCP)
Lead backup and recovery strategy design sessions for customers, including air-gapped, immutable, and ransomware-resilient architectures
Integrate Rubrik with external systems (e.g., ServiceNow, Splunk, vSphere, Azure AD) using REST APIs and automation tools (Python, Ansible, Terraform)
Design and maintain Rubrik SLA Domains, archival policies (cloud/tape), replication, and compliance workflows
Collaborate with Engineering, Storage, Security, and Application teams to ensure backup consistency and performance
Manage large-scale Rubrik clusters, capacity planning, and software upgrades
Proactively identify and resolve systemic issues across infrastructure that impact backup performance or restore SLAs
Document architectures, runbooks, and SOPs; contribute to technical training and playbooks

Fulltime

Systems Engineering Lead

My client are seeking a highly skilled and motivated Cloud Technical Lead with a...

Location

Ireland, Dublin 1

Salary:

Not provided

Solas IT Recruitment

Expiration Date

Until further notice

Requirements

Bachelor's degree in Computer Science, Information Technology, or related field
8 years of experience in cloud engineering, with a strong background in designing and deploying cloud solutions
Expertise with Kubernetes, including hands-on experience in managing and orchestrating containerized applications
Deep understanding of cloud platforms such as AWS, Azure, or Google Cloud, and related services (e.g., EC2, S3, Lambda, GKE, AKS)
Experience with Infrastructure-as-Code (IaC) tools such as Terraform, CloudFormation, or similar
Experience with multi-cloud environments and hybrid cloud architecture
Familiarity with monitoring and logging tools like Prometheus, Grafana, ELK stack, or others
Knowledge of container registries and service meshes (e.g., Istio, Linkerd)
Experience with Agile development methodologies and working in a DevOps culture
Strong proficiency in scripting and automation tools (e.g., Python, Bash, Ansible)

Job Responsibility

Lead the design and implementation of scalable, reliable, and cost-efficient cloud-based solutions using AWS, Azure, Google Cloud, or other cloud platforms
Drive the adoption of Kubernetes and containerization best practices for microservices architecture, including the orchestration, deployment, and management of Kubernetes clusters
Provide technical leadership and mentorship to a team of cloud engineers, ensuring adherence to cloud engineering best practices
Collaborate with software developers, DevOps engineers, and other teams to implement cloud-native applications, automation, and CI/CD pipelines
Ensure cloud infrastructure is secure, resilient, and meets compliance requirements, working closely with security teams to mitigate risks
Optimize cloud infrastructure performance and costs, providing recommendations for improvements and helping track usage
Troubleshoot and resolve technical issues related to cloud infrastructure, Kubernetes clusters, and services
Participate in architecture and design reviews to ensure solutions meet high availability, disaster recovery, and scalability requirements
Stay up to date with the latest cloud technologies, trends, and innovations, and propose enhancements to the cloud infrastructure strategy
Ensure proper documentation of cloud systems, architectures, and processes

Staff Security Engineer, Business Continuity & Disaster Recovery

We're seeking a Business Continuity and Disaster Recovery (BCP/DR) Senior Engine...

Location

India

Salary:

Not provided

AlphaSense

Expiration Date

Until further notice

Requirements

7+ years of hands-on experience with cloud infrastructure (AWS required; GCP/Azure beneficial)
Deep expertise in enterprise backup and recovery solutions (Veeam, Commvault, AWS Backup, or similar)
Strong understanding of cloud storage services (S3, EBS, EFS, RDS, DynamoDB, etc.)
Proficiency with Infrastructure as Code tools (Terraform, CloudFormation, Pulumi)
Experience with containerized environments (ECS, EKS, Docker) and their backup/recovery patterns
Knowledge of database backup and recovery procedures (PostgreSQL, MySQL, MongoDB, etc.)
Understanding of storage technologies, replication methods, and data protection architectures
3+ years of experience in Business Continuity Planning and Disaster Recovery
Proven track record of designing and implementing BCP/DR programs for technology organizations

Job Responsibility

Design and implement comprehensive BCP/DR programs aligned with industry frameworks (ISO 22301, NIST SP 800-34, ISO 27001)
Conduct Business Impact Analyses (BIA) to identify critical business functions, dependencies, and recovery priorities
Define and maintain Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) for all critical systems and services
Develop and maintain disaster recovery playbooks and runbooks for various incident scenarios
Create and manage crisis communication frameworks for security incidents and business disruptions
Lead tabletop exercises and disaster recovery drills to validate recovery procedures
Design and implement backup and recovery solutions for AWS cloud infrastructure (primary focus)
Build automated backup workflows for databases, storage systems, applications, and configurations
Implement immutable backup strategies and offsite replication for ransomware resilience
Monitor backup operations, validate recovery procedures, and maintain backup integrity

Middleware Support Engineer

The Middleware Support Engineer in Allianz Technology Malaysia, Regional Deliver...

Location

Malaysia, Kuala Lumpur

Salary:

Not provided

Allianz

Expiration Date

Until further notice

Requirements

Bachelor’s degree in computer science, information technology, or a related field
Proven experience in a middleware support or system administration role
Strong understanding of middleware technologies such as IBM WebSphere, Oracle WebLogic, JBoss, Apache Kafka, or similar
Familiarity with cloud-based middleware solutions and integration platforms
Strong analytical and problem-solving skills
Excellent communication and interpersonal skills
Ability to coordinate and work independently and as part of a team
Certifications in middleware technologies or related areas are a plus

Job Responsibility

System Monitoring: Supervise and monitor middleware environments to ensure optimal performance and availability
Issue Resolution: Diagnose and resolve middleware-related issues, including performance bottlenecks, connectivity problems, and integration failures; produce root cause analyses (RCA) and implement solutions
Maintenance and Upgrades: Supervise regular maintenance, patching, and upgrades of middleware systems to ensure they are up to date and secure
Support and Troubleshooting: Supervise and provide technical support to application developers and IT teams regarding middleware-related queries and issues
Vendor Management: Collaborate with vendors and service providers to evaluate new technologies and manage procurement processes
Documentation: Maintain accurate documentation of middleware configurations, processes, and issue resolutions
Collaboration: Supervise and work closely with application development teams and IT support staff to ensure seamless integration and operation of middleware solutions
Security Management: Implement and maintain security measures to protect middleware environments and ensure data integrity
Continuous Improvement: Identify opportunities to optimize middleware performance and improve support processes
Cloud Management: Plan, manage, and monitor cloud-based infrastructure. Implement and manage cloud security measures to protect data and systems

Fulltime

Director, North America Infrastructure Operations & Reliability

Alimentation Couche-Tard (Circle K) seeks a highly experienced, driven, and dyna...

Location

United States of America, Tempe

Salary:

Not provided

Circle K

Expiration Date

Until further notice

Requirements

Minimum of 10 years of progressively responsible experience in successfully managing infrastructure and operations for distributed global platforms
Strong ability to identify needs, take initiative, and prioritize work efforts, balancing operational tasks with longer-term strategic security efforts
Proven success in establishing key performance indicators, metrics, and focus to drive operational/service delivery best practices
Meticulous planning skills with a balance of risk management and efficient execution
Ability to establish and balance priorities between new initiatives and sustaining operations engineering work
Ability to establish and maintain trust and rapport with the team and external constituents
Experience leading and developing multiple team members and managed service providers
Strong knowledge and understanding of infrastructure operations and reliability best practices in a high-volume and critical production service environment
Experience managing vendor relationships for all infrastructure services and solutions and reviewing vendor contracts, statements of work, and related documents
Experience in DevOps and in migrating infrastructure and applications to the cloud

Job Responsibility

Lead a multi-disciplinary, North America-focused team, in close partnership with managed service providers, to establish roadmaps and successfully implement technology standards, including hosting, network, storage, workplace, desktop, and other datacenter infrastructure
Build strong relationships with company leaders and departments across the organization to understand the business, share knowledge, and foster a collaborative, supportive environment when recommending technology solutions to meet business objectives
Partner with cybersecurity and risk management teams to ensure the infrastructure meets security requirements and evolves over time to meet changing needs and best practices
Drive application migration to the cloud, embedding DevOps and observability tooling to enhance delivery and monitoring
Implement observability best practices and tooling to monitor the effectiveness of the delivery of application and infrastructure services
Work closely with the Operational Resiliency team to develop and implement infrastructure disaster recovery protocols that minimize disruption to business operations in emergency situations
Develop and report on relevant KPIs and metrics to drive operational maturity, improve customer experience, and aid transparency and understanding across the business of the infrastructure organization’s contributions
Maintain a strong focus on leadership and development of team members and extended team members from managed service partners
Ensure professional growth by setting direction and priorities, delegating tasks, resolving conflicts, and fostering a winning culture with high-performance-oriented team members

What we offer

Reasonable accommodation under the terms of the ADA and certain state or local laws

Senior Technology Resilience and Operations Leader

Senior Program Manager, Technology Resilience & Operations Leader responsible fo...

Location

United States, Iselin

Salary:

175000.00 - 230000.00 USD / Year

Citizens Bank

Expiration Date

Until further notice

Requirements

Ten or more years of experience in technology program management, operational resilience, technology risk, cloud engineering, or enterprise technology leadership
Demonstrated experience leading complex, cross functional enterprise programs with regulatory and operational impact
Strong knowledge of technology resilience testing, cloud architecture principles, and observability practices
Experience working with third party risk frameworks, regulatory expectations, and contract control requirements
Prior experience supporting or managing mission critical operational centers such as NOC, TOC, or SOC
Proven ability to influence and drive execution across matrixed organizations without direct authority
Strong communication, stakeholder management, and executive reporting skills

Job Responsibility

Lead the enterprise technology resilience program, including strategy, roadmap, execution cadence, and governance
Develop and maintain technology resilience testing frameworks aligned with regulatory, industry, and internal standards
Coordinate with engineering, infrastructure, and application teams to plan and execute resilience, failover, and chaos testing exercises
Establish centralized program oversight for critical asset mapping, scenario design, testing schedules, issue tracking, and remediation management
Define, track, and report resilience metrics, dashboards, test coverage, and issue aging to senior leadership and governance forums
Drive continuous improvement initiatives across disaster recovery, high availability, and fault tolerant design practices
Lead cloud governance and resilience guardrail initiatives in partnership with enterprise architecture, cloud engineering, and risk teams
Define minimum resilience design requirements for cloud native and hybrid solutions, including multi availability zone patterns, automated failover, observability, and dependency management
Program manage the integration of resilience controls into reference architectures, delivery pipelines, and automated policy enforcement
Develop and maintain standards, playbooks, and guidance to support consistent and resilient cloud adoption

What we offer

medical, dental, and vision coverage
retirement benefits
parental leave
flexible work arrangements
education reimbursement
wellness programs
paid time off

Fulltime

Senior Principal Technical Program Manager - Data Platform

You will shape a modern technical program management organization. The goal is t...

Location

United States, San Francisco

Salary:

191600.00 - 307800.00 USD / Year

Atlassian

Expiration Date

Until further notice

Requirements

12+ years experience working with software teams
4+ years recent experience leading platform and software teams in a similar Technical Program Management or Technical Product Management role
Experience building commercial Cloud Services / Platforms
Experience in designing and building back end software systems, including tradeoffs, launching and scaling
Experience leading strategy and execution on complex, cross divisional, technical programs, including analysing business priorities, customer needs, industry trends and articulating a long-term roadmap
Experience driving projects spanning multiple teams, including reaching agreements with your engineering partners and stakeholders, shepherding the projects while identifying and mitigating risks, making trade-off decisions optimising the outcome
Able to translate customer and/or product requirements into technical requirements

Job Responsibility

Shape a modern technical program management organization
Steer through growth and remove barriers in the journey towards long-term goals
Find agreement by creating guardrails and removing barriers to help teams accelerate
Build resilience into systems to ensure service and data availability for customers in the event of failures in system components
Define specific systems programs and create a plan of action for realizing those programs
Partner with and influence engineers and architects in making progress on problems
Take a systematic approach to engineering problems
Be accountable for the success of technical programs by managing the entire lifecycle from initiation to forecasting, budgeting, scheduling, etc.
Manage complex dependencies and projects with a broad scope across the company
Collaborate with functions across the company to create reliable and cost-effective disaster recovery solutions for all of Atlassian’s services

What we offer

Health coverage
Paid volunteer days
Wellness resources

Fulltime

BCP Engineer

Lead Infrastructure and Application Disaster Recovery testing and Data Center Po...

Location

Mexico, Guadalajara

Salary:

Not provided

NTT DATA

Expiration Date

Until further notice

Requirements

Bachelor’s degree
Minimum 4-5 years of experience in technology stack including infrastructure and application
Experience in managing resiliency testing for on-prem databases, NAS, object storage, block storage, etc.
Understanding of disaster recovery procedures
Understanding of RTO and RPO and how these metrics are calculated
Knowledge of the differences between resiliency testing and cyber-attack recovery/Repave testing
Background in cyber-attack recovery
Background in disaster recovery
Strong analytical, communication, interpersonal, problem solving, organizational and time management skills
Basic understanding of Excel, including the ability to manipulate data and knowledge of the basic Excel formulas used in data manipulation

Job Responsibility

Lead Infrastructure and Application Disaster Recovery testing and Data Center Power-down events
Drive adoption of the mandated controls in place with application teams
Provide guidance to application owners on how to adapt a recovery procedure to adhere to the uplifted controls
Disaster recovery tests: scope events to include the interdependencies of shared services, upstream and downstream application dependencies, order of recovery, etc.
Cyber-attack recovery testing: drive teams to become resilient and able to recover during a cyber-attack, and test the cyber-attack recovery procedures
Power-down events: establish critical milestones, establish order of recovery, and verify dependencies among infrastructure components
Coordinate and manage regulatory resiliency recovery tests, such as SIFMA's industry-wide exercises, SPOOR-related tests, and those guided by the Monetary Authority of Singapore (MAS), to ensure compliance with industry standards and regulatory requirements. This involves liaising with various internal and external teams, scheduling test activities, monitoring progress, and documenting outcomes to support robust audit and risk management processes
Identify gaps in processes and procedures and enhance those processes
Identify opportunities for automation
Oversee and manage the execution plans

Cloud Resilience & Disaster Recovery Engineer

Randstad

Location:
Australia, Melbourne

Category:
IT - Software Development

Contract Type:
Not provided

Job Posted:
April 22, 2026

Expiration:
April 22, 2026
