Lead Systems Operations Engineer

Wells Fargo

Location:
United States, West Des Moines

Category:
IT - Administration

Contract Type:
Employment contract

Salary:

119,000.00 - 206,000.00 USD / Year

Job Description:

Wells Fargo is seeking a highly skilled and forward-thinking Lead Systems Operations Engineer to join our API SRE & Operations team within the CTO Platform Services team. This role is ideal for someone passionate about building scalable, resilient, and intelligent infrastructure solutions. You will play a key role in driving automation, reducing operational toil, and enabling self-service capabilities through cutting-edge technologies, including Generative AI and Agent development.

Job Responsibilities:

  • Lead complex, broad impact initiatives including provision of high-level systems consultation for the technology teams
  • Work as key participant in large scale planning of computer systems and network infrastructure for Systems Operations functional area
  • Review and analyze complex technical challenges, as well as escalated support issues related to core business solutions that require in depth evaluation of multiple factors, such as alternatives, enhancements, periodic systems reviews, or improvements to existing systems
  • Make decisions on technical changes and enhancements
  • Consult with engineering team on change design requiring solid understanding of technical process controls or standards that influence and drive new initiatives
  • Collaborate and consult with technical peers, colleagues, and mid to more experienced level managers to resolve systems support issues and achieve goals
  • Production support activities, covering the areas listed below
  • Incident Management: Triage incidents, engage partner teams, provide status updates, facilitate business user communication
  • Problem Management: Ticket management for daily tasks and efforts that are brought to support attention, Root cause analysis
  • Batch Management: Facilitate batch job creation, implementation, and change, Update batch schedules, Batch job documentation
  • Change Management: Identify forward schedule of change to applications and environments, Review post-change implementation successes / failures and create action plans to remediate if required
  • Monitoring: Implementation of Alerts and Configuration - Customize alerting tools based on application specific thresholds, Enable business transaction monitoring
  • BCP Support: Documentation and coordination efforts to secure application resiliency prior to BCP event, Test execution during scheduled BCP events
  • Capacity Management: Support capacity planning initiatives and provide application information to capacity planning teams
  • Audit and Compliance support: Participate in audit activities and provide data to auditors on production environment variables
  • Automation: Configure dashboards and develop scripts to automate day to day tasks from platform perspective
  • On-call: Provide support during deployments and carry pager to support after hours

Requirements:

  • 5+ years of Systems Engineering, Technology Architecture experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
  • 4+ years of experience using observability platforms such as BigPanda, ThousandEyes, Grafana, Prometheus, ELK, Splunk Observability, and AppDynamics to enhance service reliability and performance monitoring
  • 4+ years of experience in IT Service Management (ITSM), with a strong background in incident, problem, and change management processes
  • 3+ years of experience working with Red Hat Enterprise Linux and Kubernetes, with a strong focus on Red Hat OpenShift Container Platform (OCP)
  • 3+ years of experience with Site Reliability Engineering and supporting production grade
  • 3+ years of experience with solid understanding of Apigee or similar API Management platforms
  • 3+ years of experience with cloud-native architectures, high-availability systems, Cloud & Container Technologies like GCP or Azure and familiarity with Kubernetes
  • 3+ years of experience with Automation & Scripting: Expertise in Ansible Tower, including developing and maintaining playbooks

Nice to have:

  • Strong experience working in Agile methodologies / Scrum environments
  • Experience in project management and stakeholder engagement
  • Proven experience in leading cross-functional teams
  • Strong problem-solving and decision-making abilities
  • Excellent communication and collaboration skills

What we offer:

  • Health benefits
  • 401(k) Plan
  • Paid time off
  • Disability benefits
  • Life insurance, critical illness insurance, and accident insurance
  • Parental leave
  • Critical caregiving leave
  • Discounts and savings
  • Commuter benefits
  • Tuition reimbursement
  • Scholarships for dependent children
  • Adoption reimbursement

Additional Information:

Job Posted:
October 05, 2025

Expiration:
October 13, 2025

Employment Type:
Full-time

Work Type:
Hybrid work