Wells Fargo is seeking an experienced Lead Software Engineer to lead the design, development, and evolution of our enterprise observability and telemetry platforms. This role is critical to driving modern logging, monitoring, and AI‑driven insights using Splunk, Cribl, IT Service Intelligence (ITSI), and agentic AI systems. You will play a hands‑on technical leadership role—owning architecture decisions, driving automation, mentoring engineers, and partnering with platform, SRE, and security teams to deliver scalable, resilient, and intelligent observability solutions across hybrid and cloud environments.
Job Responsibilities:
Own the architecture, reliability, and scalability of enterprise logging platforms
Lead design and implementation of high‑volume, resilient log ingestion pipelines across hybrid and cloud environments
Define and enforce logging standards, schemas, and governance aligned with enterprise observability strategy
Design and integrate AI/ML models for anomaly detection, log classification, predictive alerting, and signal enrichment
Build and operationalize agentic AI systems capable of autonomous log analysis and root‑cause hypothesis generation, context‑aware remediation recommendations, and intelligent correlation across logs, metrics, and traces
Partner with platform and SRE teams to embed AI‑driven insights into incident response workflows
Partner with security and compliance teams to support auditability, retention, and access controls
Requirements:
5+ years of Software Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
5+ years of experience in software engineering or platform engineering
Deep expertise with Splunk Enterprise / Splunk Cloud, including search optimization, index design, and platform operations
Proven hands‑on experience with Cribl (Stream / Edge) for telemetry pipeline optimization
Strong experience with Splunk ITSI, service modeling, KPIs, and episode management
Hands‑on experience with agentic AI frameworks or autonomous agents (LLM‑based or rule‑driven)
Strong understanding of prompt engineering, tool‑using agents, feedback loops, and guardrails
Proficiency in one or more languages: Python, Java, Go, or Scala
Experience with cloud platforms, containerization, and Kubernetes/OpenShift
Familiarity with OpenTelemetry, observability standards, and telemetry correlation
Experience working on large Splunk infrastructure, including clustered environments, multi-site deployments, and cloud/SaaS deployments
Exposure to containerization and orchestration tools (Docker, Kubernetes)
Familiarity with DevOps practices and CI/CD pipelines
Certifications in Splunk, Cribl, or cloud technologies (AWS, Azure)
Experience applying AI/ML techniques to operational or telemetry data