Lead Systems Operations Engineer Job at Wells Fargo (Chandler)

Job Description

About this role: We are seeking a highly skilled and forward‑thinking Lead Systems Operations Engineer to join our Technology Operations team. This role is ideal for someone who excels in Kubernetes and OpenShift platform operations, drives operational excellence, and leads initiatives that improve stability, automation, and service reliability. You will play a key role in operating and improving our cloud‑native platforms, reducing operational toil, and ensuring the resilience and compliance of critical infrastructure services.

Job Responsibilities

Lead complex, broad-impact initiatives, including provision of high-level systems consultation for the technology teams
Platform Operations Leadership: Lead day‑to‑day platform operations (Redis, OpenShift), including cluster maintenance, upgrades, performance monitoring, and troubleshooting
Operations Excellence: Improve operations practices to meet new incident SLAs and strengthen practices during incident and problem management
Incident Response & Problem Management: Serve as an operational lead during incidents, driving rapid diagnosis, resolution, root‑cause analysis, and long‑term corrective actions
Operational Automation: Develop or enhance automation (Python, Bash, GitOps workflows, or AI‑assisted tools), and build AI agents, MCP servers, tools, and MCP skills that eliminate manual effort and streamline run processes
Platform Readiness: Lead Platform lifecycle activities, including new cluster builds, configuration, onboarding, upgrades, and cluster decommissioning, ensuring consistency, reliability, and compliance across environments
Collaboration & Enablement: Partner with engineering, SRE, security, and development teams to implement repeatable operational patterns, guardrails, and platform readiness standards
Security, Compliance & Governance: Ensure platform operations follow organizational policies, security standards, audit controls, and regulatory requirements
Continuous Improvement: Identify operational gaps, recurring issues, or inefficiencies and lead initiatives to enhance reliability, resiliency, and operational maturity

Requirements

5+ years of Systems Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
5+ years of hands-on experience in Python for platform operations automation
5+ years of designing and building complex observability solutions leveraging industry-standard toolsets and/or custom-built solutions
Strong proficiency in writing production-quality Python code using Python libraries and client integrations
Ability to develop automation solutions, including remediation procedures and workflows and operational tools, using Python
3+ years of experience managing complex, enterprise-scale applications in production environments
Extensive experience with configuration and monitoring tools such as Grafana, Splunk, and Prometheus
Deep platform expertise, including cluster build-outs, CI/CD pipeline integration, troubleshooting, debugging, remediation, patching, upgrades, and root cause analysis (RCA)
2+ years of hands-on Linux system administration experience
Deep expertise with platforms, including building clusters with pipelines, diagnosing, debugging, remediation, upgrades, patching, and RCA

Nice to have

Strong experience with OpenShift, Kubernetes, and public cloud
Experience in AI development, including agents, MCP, tools, and related frameworks
Hands-on experience with operational tooling such as Grafana, Splunk, Prometheus, Jira, or GitHub, and with the SDLC
Demonstrated ability to influence operational improvements across teams
Strong analytical and operational problem‑solving skills
