We are seeking a detail-oriented and proactive Cloud Operations Senior Support Engineer to join our growing Cloud Infrastructure team. The role supports day-to-day operations including incident response, monitoring, access management, provisioning and troubleshooting of cloud resources, and works closely with application teams and cloud architecture to enforce operational best practices, implement automation and drive service reliability.
Job Responsibilities:
lead incident management, including triage, impact assessment and coordination with engineering teams to resolve issues, and participate in the on-call rotation covering high-severity / major incidents
collaborate with stakeholders to resolve chronic issues, reduce toil and lead Root Cause Analysis (RCA) after service is restored
drive process improvements in areas such as automation of repetitive tasks, tech debt reduction, testing approaches, complex operational processes, reporting streams and operational readiness
create, maintain and enhance operational runbooks, SOPs and knowledge base articles
support provisioning and configuration of cloud resources across multiple environments, including identifying opportunities for improvement
implement and maintain monitoring, logging and alerting tools (e.g., CloudWatch, Stackdriver, Prometheus)
monitor AWS/GCP infrastructure and services to ensure availability, performance and reliability
operate with a limited level of direct supervision
act as SME to senior stakeholders and/or other team members
collaborate with product, engineering, security and other stakeholders and lead value-adding outcomes
Requirements:
8-10+ years of experience in roles centered around infrastructure delivery (application hosting and/or end user services) with a proven track record of operational process change and improvement
experience in cloud operations/support and site reliability
hands-on experience with AWS and/or GCP
proficiency with Infrastructure as Code (IaC) tools such as Terraform and CloudFormation
working knowledge of scripting (Bash, Python or similar)