Lead Software Engineer – Microservices, Observability & AI Platforms

United States, Phoenix · Job Posted May 29, 2026

Job Description

Wells Fargo is seeking an experienced Lead Software Engineer within Technology Engineering to design, develop, and lead scalable backend applications and platform services. This role requires deep expertise in modern backend technologies, strong architectural skills, and the ability to mentor engineers while partnering closely with product and distributed engineering teams.

Job Responsibility

  • Lead complex technology initiatives including those that are companywide with broad impact
  • Act as a key participant in developing standards and companywide best practices for engineering complex and large-scale backend and platform solutions
  • Design, code, test, debug, and document backend systems, APIs, and platform services for projects and programs
  • Design and implement scalable architecture patterns including microservices, API-first, and distributed systems
  • Design and develop scalable telemetry, observability, and analytics solutions to support real-time operational visibility and decision-making
  • Develop and optimize Splunk dashboards and analytics that power real-time operational insights, advanced alerting, historical analysis, and AI/ML model inputs
  • Design and build Beacon / Telemetry APIs to capture application, platform, and business signals
  • Build and maintain telemetry ingestion services and pipelines that collect, normalize, store, and enrich data for analytics and AI/ML use cases
  • Contribute to leveraging AI/ML techniques for operational intelligence, anomaly detection, and automation opportunities
  • Review and analyze complex, large-scale technology solutions for tactical and strategic business objectives and technical challenges that require in-depth evaluation
  • Make decisions in developing companywide best practices for engineering and technology solutions, influencing teams to meet deliverables and drive new initiatives
  • Collaborate and consult with key technical experts, senior technology team, and external industry groups to resolve complex technical issues and achieve goals
  • Lead projects, teams, or serve as a peer mentor

Requirements

  • 5+ years of Software Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
  • 4+ years of experience in Java (Spring Boot / Microservices) or Python (APIs, backend services)
  • 4+ years of experience designing and building microservices and distributed systems
  • 3+ years of experience in API design and development (RESTful services)
  • 3+ years of system design and architecture experience for scalable, high-performance applications
  • 3+ years of experience working with cloud-native or containerized platforms (Kubernetes, OpenShift, or similar)
  • 2+ years of experience with CI/CD pipelines, DevOps practices, and automated testing frameworks
  • Strong understanding of data structures, algorithms, and design patterns
  • Experience building and optimizing Splunk dashboards using advanced SPL techniques including multi-stage pipelines, statistical functions, joins, lookups, and data enrichment
  • Experience developing Splunk analytics for real-time insights, alerting, and historical analysis
  • Experience designing and developing Telemetry / Beacon APIs for capturing application and business signals
  • Experience building and maintaining telemetry ingestion pipelines and services
  • Strong understanding of telemetry standards and concepts (logs, metrics, traces, events)

Nice to have

  • Experience with Observability / Monitoring tools such as Splunk, AppDynamics, Dynatrace, Datadog, or similar
  • Exposure to AI / ML / GenAI / Agentic AI use cases
  • Experience with event-driven architecture or streaming platforms (Kafka or similar)
  • Experience building or supporting enterprise-scale, highly available applications
  • Excellent problem-solving and analytical skills
  • Strong communication and stakeholder-management abilities
  • Ability to lead by influence and example in a fast-paced environment
  • Knowledge of API integrations, distributed systems, and backend performance optimization
  • Familiarity with modern development tools, frameworks, and engineering best practices

Similar Jobs for

Lead Software Engineer – Microservices, Observability & AI Platforms

8 matching positions

Senior Software Engineer, Backend (Voice Platform)

At Cresta, the Voice Platform team is on a mission to transform real-time voice ...
Location: Romania, Bucharest
Salary: Not provided
Company: Cresta (cresta.com)
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Computer Science or related field
  • 5+ years of experience in backend system development, distributed systems, or cloud infrastructure
  • Expertise in Go (or a similar systems language) with strong API and service design skills
  • Proven experience with scalable architectures using microservices, workflow orchestration, distributed caching, and cloud databases
  • Strong knowledge of Kubernetes, Docker, and modern cloud infrastructure
  • Solid understanding of networking, real-time communication protocols, and cloud security best practices
  • Demonstrated ability to lead complex technical projects from design through production
Job Responsibility
  • Lead the design and development of scalable, distributed backend microservices in Golang (with some Python for AI-related services)
  • Own and evolve voice platform integrations with large-scale enterprise communication and contact center systems
  • Drive initiatives to expand platform capabilities, including bi-directional SIP, WebRTC integrations, multilingual support, advanced transcription, and real-time translation
  • Build systems that power conversation intelligence for both remote and in-person interactions
  • Improve observability, reliability, and self-service troubleshooting across the platform
  • Ensure performance, scalability, and resilience of real-time voice pipelines running in the cloud
  • Collaborate with cross-functional teams (ML, product, solution architects) to design end-to-end solutions for customer deployments
  • Provide technical guidance, mentorship, and best practices to other engineers on the team
What we offer
  • Compensation for this position includes a base salary, equity, and a variety of benefits

AI Engineer

Reporting to the AI & Technology Oversight Manager, the AI Engineer is responsib...
Location: India, Mumbai
Salary: Not provided
Company: Waystone Governance Ltd. (waystone.com)
Expiration Date: Until further notice
Requirements
  • Deep understanding of the distinction between Generative AI and Agentic AI, including their foundations, capabilities, and appropriate use cases
  • Strong understanding of AI, ML and LLM concepts, including prompt engineering, prompt grounding, iterative loop techniques, context windows, embeddings, RAG, agentic workflows
  • Proven ability to integrate AI capabilities into both low-code automation flows and high-code stacks, including applications, APIs, microservices, distributed systems, and development or testing tools
  • Solid software development background with hands-on coding experience in one or more engineering ecosystems such as .NET (C#), Python, or TypeScript
  • Excellent communication skills, with the ability to translate complex AI concepts for non‑experts and to effectively influence and collaborate with stakeholders at all levels, both technical and non‑technical
  • Strong writing skills, with the ability to contribute to AI literacy and AI fluency documentation
  • Strong understanding of responsible AI principles, including governance, bias mitigation, compliance, and risk-based decision-making
  • Analytical thinking with excellent problem‑solving ability and keen attention to detail
  • Ability to mentor developers and testers, and to drive innovation across engineering, QA, and architecture
  • Ability to assess AI‑enabled capabilities in third‑party SaaS platforms (e.g., Appian, Salesforce, etc.) and provide guidance on responsible, effective adoption
Job Responsibility
  • Hands-on contributor to the design and development of AI-enabled solutions, capable of writing both production-quality code and rapid experimental prototypes
  • Develop and implement AI‑enabled microservices, APIs, applications, and internal tools
  • Integrate AI capabilities following secure, scalable engineering best practices
  • Design, build and validate AI‑driven solutions leveraging providers such as OpenAI and Anthropic
  • Enhance low‑code/no‑code automation platforms (e.g., Power Automate, n8n, Workato) by embedding intelligent processing and applying agentic patterns where relevant
  • Implement Model Context Protocol (MCP) servers for secure AI‑to‑system connectivity
  • Lead AI‑based document parsing and intelligent data extraction initiatives
  • Contribute to educating and enabling Enterprise Capabilities areas, including Integration and Automation, by providing guidance, training, and best practices, e.g., on effective use of n8n agents
  • Engage with business stakeholders to understand requirements, constraints, and key drivers, identifying and implementing high‑value AI opportunities across Waystone
  • Prototype AI features and iterate towards production‑ready capabilities
Employment type: Full-time

AI Engineer

Reporting to the AI & Technology Oversight Manager, the AI Engineer is responsib...
Location: United Kingdom, Leeds
Salary: Not provided
Company: Waystone Governance Ltd. (waystone.com)
Expiration Date: Until further notice
Requirements
  • Deep understanding of the distinction between Generative AI and Agentic AI, including their foundations, capabilities, and appropriate use cases
  • Strong understanding of AI, ML and LLM concepts, including prompt engineering, prompt grounding, iterative loop techniques, context windows, embeddings, RAG, agentic workflows
  • Proven ability to integrate AI capabilities into both low-code automation flows and high-code stacks, including applications, APIs, microservices, distributed systems, and development or testing tools
  • Solid software development background with hands-on coding experience in one or more engineering ecosystems such as .NET (C#), Python, or TypeScript
  • Excellent communication skills, with the ability to translate complex AI concepts for non‑experts and to effectively influence and collaborate with stakeholders at all levels, both technical and non‑technical
  • Strong writing skills, with the ability to contribute to AI literacy and AI fluency documentation
  • Strong understanding of responsible AI principles, including governance, bias mitigation, compliance, and risk-based decision-making
  • Analytical thinking with excellent problem‑solving ability and keen attention to detail
  • Ability to mentor developers and testers, and to drive innovation across engineering, QA, and architecture
  • Ability to assess AI‑enabled capabilities in third‑party SaaS platforms (e.g., Appian, Salesforce, etc.) and provide guidance on responsible, effective adoption
Job Responsibility
  • Hands-on contributor to the design and development of AI-enabled solutions, capable of writing both production-quality code and rapid experimental prototypes
  • Develop and implement AI‑enabled microservices, APIs, applications, and internal tools
  • Integrate AI capabilities following secure, scalable engineering best practices
  • Design, build and validate AI‑driven solutions leveraging providers such as OpenAI and Anthropic
  • Enhance low‑code/no‑code automation platforms (e.g., Power Automate, n8n, Workato) by embedding intelligent processing and applying agentic patterns where relevant
  • Implement Model Context Protocol (MCP) servers for secure AI‑to‑system connectivity
  • Lead AI‑based document parsing and intelligent data extraction initiatives
  • Contribute to educating and enabling Enterprise Capabilities areas, including Integration and Automation, by providing guidance, training, and best practices, e.g., on effective use of n8n agents
  • Engage with business stakeholders to understand requirements, constraints, and key drivers, identifying and implementing high‑value AI opportunities across Waystone
  • Prototype AI features and iterate towards production‑ready capabilities
Employment type: Full-time

Staff Infrastructure Software Engineer - AI Platform

We are currently seeking a Staff Software Engineer to join the AI Platform team ...
Location: United Kingdom, Edinburgh
Salary: Not provided
Company: Addepar (addepar.com)
Expiration Date: Until further notice
Requirements
  • Extensive experience as a Software/Backend Engineer, with a track record of taking on increasing responsibility
  • Experience across the full product lifecycle: designing, implementing, shipping, scaling, operationalizing, and maintaining technology/SaaS products
  • Exceptional programming skills and fundamentals in Python/Go/Java, with a proven track record of building large-scale production systems
  • Proficiency with diverse compute environments, including microservices (K8s), Databricks, and serverless architectures (e.g., AWS Lambda)
  • Demonstrable experience leading initiatives with infrastructure-as-code tools such as Terraform in complex, multi-account environments
  • Proficiency with comprehensive monitoring and alerting stacks (e.g., Prometheus/Grafana/Sentry/cloud-native tools), with a focus on observability strategy
  • Excellent interpersonal and communication skills to effectively collaborate with multi-functional teams, articulate complex technical concepts, and influence outcomes
Job Responsibility
  • Design and build the production runtime for LLM-based agents and products, creating the services and infrastructure that serve autonomous agents
  • Develop deep application-level knowledge to proactively inform and influence requirements, constraints and best practices for implementing composable, complex AI systems
  • Lead the design, implementation, and automation of production infrastructure on a variety of cloud environments (Kubernetes/Databricks), to enable us to ship and scale AI features instantly
  • Evangelize and promote disciplined, best engineering practices to enforce strong production hygiene and culture
  • Initiate and lead collaborations with cross-functional teams to identify and resolve complex application or infrastructure issues, serving as a technical subject matter expert
  • Architect, build, and maintain advanced, automated CI/CD pipelines (e.g., using Jenkins, ArgoCD, AWS CodeBuild/Pipeline, GitHub Actions, or similar), establishing best practices for deployment strategies (e.g., blue/green, canary)
  • Develop systems and best practices for monitoring, alerting on, and troubleshooting our probabilistic and AI-driven systems and broader software stack

Lead Software Engineer - AI Engineering

Join RTB House and lead our AI Engineering Lab, a team dedicated to pioneering i...
Location: Poland
Salary: Not provided
Company: RTB House (rtbhouse.com)
Expiration Date: Until further notice
Requirements
  • Minimum of 6 years of professional experience in Software Engineering, with a strong background in building and deploying complex, large-scale systems
  • Distributed Systems Expertise: Proven, hands-on experience designing, developing, and operating distributed systems at scale (e.g., microservices, event-driven architectures, stream processing)
  • Programming Languages: Proficiency in at least two programming languages, with Python being mandatory
  • AI/ML Engineering: Basic understanding of the Machine Learning lifecycle, MLOps practices, and experience in integrating ML models (especially LLMs) into production applications
  • Technical Leadership: Demonstrated experience in technical leadership, including defining technical roadmaps, mentoring junior engineers, leading code reviews, and driving architectural decisions
  • Education: Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field.
Job Responsibility
  • Lead, mentor, and grow a team of talented Frontend/Full Stack and Backend engineers, fostering a culture of technical excellence and high code quality
  • Serve as a Full Stack tech-leader (often hands-on), contributing to the design and development of key architectures and full stack solutions that support various platforms (Web, Mobile, CTV)
  • Define and execute the team's charter, focusing on end-to-end customer interactions and the reliable display of ads globally
  • Develop and oversee state-of-the-art observability systems for the Ad Display platform, tracking crucial metrics like reliability, viewability, latency, and providing deep debugging insights for ad creation teams
  • Provide governance for cross-team ad rollout, including defining best practices and tooling for rigorous testing and deployment strategies (A/B testing, Canary deployments)
  • Lead complex technical projects at massive scale, ensuring our solutions can handle millions of requests and maintain high performance worldwide
  • Collaborate intensely with a Staff Frontend Engineer, stakeholders from Ads layouts creation teams (designers, graphic specialists), and the core Bidding Platform backend teams.
What we offer
  • Projects focused on high code quality – solid code reviews are our standard
  • Collaboration within an interdisciplinary, self-sufficient team including: DevOps (ensuring a great Developer Experience), database experts, backend developers, product designers, and QA engineers
  • Hardware and software tailored to your preferences – e.g. MacBook, AI tool licenses
  • Access to modern technologies and the opportunity to apply them in large-scale, high-impact projects
  • Flexible working conditions – no core hours, fully remote cooperation.

Executive Director, Digital Engineering- Aetna Member Services

The Executive Director, Digital Engineering- Aetna Member Services is a senior t...
Location: United States, Work at Home
Salary: 175100.00 - 334750.00 USD / Year
Company: CVS Health (https://www.cvshealth.com/)
Expiration Date: May 31, 2026
Requirements
  • 15+ years of software engineering experience with deep expertise in backend systems, distributed services, and API platforms
  • Proven experience leading large engineering organizations delivering mission‑critical services
  • Strong background in AWS cloud platform, microservices architecture, CI/CD pipelines, and DevOps/SRE practices
  • Demonstrated success driving stability, resiliency, and observability improvements at scale
  • Experience leveraging AI, ML, or LLM-based engineering and operational tooling
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience
Job Responsibility
  • Lead the design, development, and delivery of scalable backend systems, APIs, and microservices powering member-facing capabilities
  • Define API contract standards and integration patterns used across Member Services platforms
  • Drive service modernization by adopting cloud‑native architectures, containerization, service mesh, and event-driven patterns
  • Establish standards for availability, resiliency, performance, and disaster recovery across all services
  • Implement SLO/SLI/error budget frameworks, health checks, and high‑availability architectures
  • Institutionalize strong observability practices using metrics, logs, traces, and distributed monitoring
  • Drive continuous reliability improvements through chaos engineering, automated fault injection, and proactive root‑cause analysis
  • Integrate AI and LLM-based tooling into software development, QA, and operational processes
  • Promote AIOps capabilities to reduce manual toil and amplify engineering productivity
  • Introduce AI-enhanced workflows across Member Services to improve personalization, routing, and intelligent decisioning
What we offer
  • Affordable medical plan options
  • 401(k) plan (including matching company contributions)
  • Employee stock purchase plan
  • No-cost programs for all colleagues including wellness screenings, tobacco cessation and weight management programs, confidential counseling and financial coaching
  • Paid time off
  • Flexible work schedules
  • Family leave
  • Dependent care resources
  • Colleague assistance programs
  • Tuition assistance
Employment type: Full-time

Gen AI Technical Lead

We are seeking an experienced AI Technical Lead to design, build, and scale AI s...
Location: United States, Frederick
Salary: 150000.00 USD / Year
Company: Realign (realign-llc.com)
Expiration Date: Until further notice
Requirements
  • 10+ years of software engineering experience
  • 3+ years in applied AI/ML engineering
  • Strong proficiency in Python, AI frameworks (PyTorch, TensorFlow), and API-driven development
  • Hands-on expertise with GenAI, LLM orchestration frameworks, RAG pipelines, and vector databases (Pinecone, FAISS, Azure AI Search)
  • Experience developing Agentic AI using tool/function integration, planning agents, workflows, or multi-agent systems
  • Deep understanding of cloud platforms, preferably Azure, including MLOps and containerization (Docker, Kubernetes)
  • Solid understanding of system design, distributed computing, microservices, and API architecture
Job Responsibility
  • Lead the design and development of Generative AI solutions using LLMs, multimodal models, and diffusion models
  • Build and maintain Agentic AI systems, including autonomous agents, tools integration, reasoning frameworks, and multi-agent workflows
  • Develop production-grade AI components using strong engineering principles: modularity, maintainability, observability, and testing
  • Optimize models for performance, latency, and cost using techniques like quantization, distillation, and retrieval augmentation
  • Design scalable, secure, and compliant AI solutions aligned with enterprise engineering standards
  • Ensure adherence to best practices in software development, version control, CI/CD, containerization, and LLMOps/MLOps
  • Lead development of applications using LLMs (OpenAI, Azure OpenAI, Anthropic, Llama, etc.)
  • Build RAG pipelines, vector-based retrieval systems, and knowledge grounding solutions
  • Create autonomous and semi-autonomous AI agents capable of task execution, planning, and reasoning
  • Drive experimentation with new GenAI techniques: prompt engineering, tool use, function calling, and model fine-tuning
Employment type: Full-time