Site Reliability Engineering Manager

Hewlett Packard Enterprise

Location:
India, Bangalore

Category:
IT - Administration

Contract Type:
Employment contract

Salary:
Not provided

Job Description:

Hewlett Packard Enterprise (HPE) is looking for a Site Reliability Engineering Manager to lead and mentor SRE teams, own the reliability strategy for SASE cloud infrastructure systems, and partner with other teams to deliver scalable cloud-native services. The role involves improving automation, observability, and operational efficiency while driving best practices and a strong engineering culture.

Job Responsibility:

  • Lead and mentor a team of Site Reliability Engineers, supporting their growth, performance, and well-being
  • Own the reliability strategy for SASE cloud infrastructure systems, including incident management, SLIs/SLOs, and capacity planning
  • Partner with Engineering, Product, and Security teams to design and deliver highly available, scalable, and resilient cloud-native services
  • Guide the team in building automation, improving observability, and increasing the operational efficiency of our cloud infrastructure
  • Drive adoption of best practices in monitoring, alerting, on-call operations, and runbook development
  • Build and maintain a strong engineering culture based on ownership, collaboration, and continuous learning
  • Define and track key reliability metrics, and report on team performance and system health to leadership
  • Contribute to hiring, onboarding, and career development for SREs.

Requirements:

  • 7–10 years of experience in Site Reliability Engineering, DevOps, or Cloud Infrastructure roles
  • Minimum 2 years of experience managing or leading cloud operations teams
  • Deep understanding of cloud platforms (AWS, GCP, or Azure) and cloud-native architectures
  • Hands-on experience with Kubernetes, containers, infrastructure as code (e.g., Terraform), and configuration management tools
  • Strong foundation in observability (monitoring, logging, tracing), automation using Python, and incident response
  • Familiarity with modern CI/CD automation and tools
  • Excellent communication, stakeholder management, and team-building skills
  • Experience scaling SRE practices in high-growth or large-scale environments
  • Ability to balance long-term reliability initiatives with short-term delivery needs.

What we offer:

  • Health & Wellbeing benefits for physical, financial, and emotional wellbeing
  • Personal & Professional Development programs
  • Unconditional inclusion in the workplace.

Additional Information:

Job Posted:
May 30, 2025

Employment Type:
Full-time

Work Type:
Hybrid work