We are seeking a detail-oriented and proactive Cloud Operations Support Engineer to join our growing Cloud Infrastructure team. The public cloud operations support engineer is responsible for ensuring the stability, performance, and availability of cloud-based infrastructure and services across platforms such as AWS and GCP. This role supports day-to-day operations, including incident response, monitoring, access management, provisioning, and troubleshooting of cloud resources. The engineer will work closely with application teams, your team, and cloud architecture to enforce operational best practices, implement automation, and drive service reliability.
Job Responsibilities:
Monitor AWS/GCP infrastructure and services to ensure availability, performance and reliability
Lead incident management, including triage, impact assessment, and coordination with engineering teams to resolve issues
Participate in the on-call rotation to provide support coverage for high-severity and major incidents
Collaborate with stakeholders to resolve chronic issues, reduce toil, and lead Root Cause Analysis (RCA) after service is restored
Design testing approaches, complex processes, reporting streams, and assist with the automation of repetitive tasks
Provide technical/strategic direction to team members
Create, maintain, and enhance operational runbooks, SOPs, and knowledge base articles
Support provisioning and configuration of cloud resources across multiple environments
Implement and maintain monitoring, logging, and alerting tools (e.g., CloudWatch, Stackdriver, Prometheus)
Ensure ongoing compliance with regulatory requirements
Operate with limited direct supervision
Act as a subject matter expert (SME) for senior stakeholders and other team members
Collaborate with product, engineering, security, and other stakeholders to drive value-adding outcomes
Appropriately assess risk when business decisions are made
Requirements:
8-10+ years of experience in roles centered on infrastructure delivery
Experience in cloud operations/support and site reliability
Hands-on experience with AWS and/or GCP
Proficiency with Infrastructure as Code (IaC) tools such as Terraform or CloudFormation
Working knowledge of scripting (Bash, Python, or similar)