Platform Engineering, Monitoring and Observability Lead – SRE Focus

Wells Fargo

Location:
India, Bengaluru

Category:
IT - Software Development

Contract Type:
Not provided

Salary:
Not provided

Job Description:

Lead Systems Operations Engineer role focused on platform engineering, monitoring and observability with SRE focus. The position involves leading complex initiatives, providing high-level systems consultation, and working as a key participant in large-scale planning of computer systems and network infrastructure.

Job Responsibility:

  • Lead complex, broad-impact initiatives, including provision of high-level systems consultation for the technology teams
  • Work as a key participant in large-scale planning of computer systems and network infrastructure for the Systems Operations functional area
  • Review and analyze complex technical challenges, as well as escalated support issues related to core business solutions
  • Make decisions on technical changes and enhancements
  • Consult with engineering team on change design requiring solid understanding of technical process controls or standards
  • Collaborate and consult with technical peers, colleagues, and mid-level to more experienced managers to resolve systems support issues and achieve goals
  • Operate on a 16x5 schedule (16 hours a day, 5 days a week)
  • Participate in weekend on-call rotations
  • Provide support for high-priority incidents and system health checks
  • Be available during off-hours when necessary to support major incidents, deployments, or escalations

Requirements:

  • 5+ years of Systems Engineering or Technology Architecture experience, or equivalent demonstrated through work experience, training, military experience, or education
  • Lead the strategy and execution of monitoring and observability initiatives across infrastructure and applications
  • Architect and maintain dashboards, alerts, and telemetry pipelines using tools like Grafana, Prometheus, and Elastic APM
  • Integrate and optimize observability platforms including Splunk, AppDynamics, ThousandEyes, and ITRS Geneos
  • Collaborate with SRE and DevOps teams to ensure system reliability, scalability, and performance
  • Develop automation scripts in Python and Shell for data collection, analysis, and alerting
  • Drive root cause analysis and incident response using observability data
  • Evaluate and implement Gen AI solutions to enhance observability and predictive analytics
  • Mentor junior engineers and promote best practices in monitoring and reliability engineering
  • Bachelor's or Master's degree in Computer Science, Engineering, or related field
  • 5+ years of experience in IT operations, with at least 3 years in a lead role focused on observability and SRE
  • Proven expertise in tools: Splunk, ITRS Geneos, Grafana, Prometheus, Elastic APM, ThousandEyes, AppDynamics
  • Strong scripting skills in Python and Shell
  • Deep understanding of SRE principles including SLIs, SLOs, error budgets, and incident management
  • Experience with cloud platforms (AWS, Azure, or GCP) and containerized environments (Kubernetes, Docker)
  • Strong analytical mindset with a passion for data-driven decision-making
  • Excellent communication and stakeholder management skills

Nice to have:

  • Certifications in observability tools or cloud platforms (e.g., Splunk Certified Admin, AWS Cloud Practitioner)
  • Experience with machine learning or Gen AI frameworks applied to observability
  • Familiarity with CI/CD pipelines and infrastructure as code (Terraform, Ansible)

Additional Information:

Job Posted:
September 08, 2025

Expiration:
September 11, 2025

Employment Type:
Full-time
Work Type:
On-site work