Lead Observability Platform Engineer

United States, Plano · 149800.00 - 188100.00 USD / Year · Job Posted March 22, 2026

Job Description

Capital One is looking for an Observability Platform Engineer to join our Associate Experience Endpoint Engineering team. Our desktop endpoint platforms represent ~70,000 devices, and we’re seeking an expert to help create better visibility into those devices. The role will focus on working with our Windows and Mac platform engineering teams to build dashboards and monitors, perform and present analysis, create data ingestion strategies, and tell stories that highlight opportunities to improve the user experience on those platforms.

Job Responsibility

  • Work with partner teams to update configurations for our log collectors on our Windows and Mac endpoints
  • Work with stakeholders to identify, discuss and prioritize log ingestion strategies
  • Build complex dashboards that tell stories about the health of our endpoints, and identify opportunities for improvements
  • Create monitors that alert platform teams when changes to the environment may be impacting the health of devices and user experiences
  • Create reports that detail the performance of applications on our endpoints, and applications being considered for future deployment
  • Assist platform teams with issue triage by providing complex data and log analysis where needed
  • Use data to tell stories to our senior leaders and help drive vendor and product roadmaps
  • Help create processes and strategies that can validate changes in performance across operating system and product version updates

Requirements

  • High School Diploma, GED, or equivalent certification
  • At least 3 years of experience creating reports and building alert monitors
  • At least 3 years working with macOS and Windows platforms
  • Strong analytical and technical skills
  • Ability to foster collaborative, open, working relationships with technology groups and other stakeholders, including vendor relationships
  • Demonstrated clear communication skills and ability to interact effectively at all levels of an organization, and to influence senior management and executives
  • Strong knowledge of syntax structures for reporting languages, such as SQL or Opal, and good familiarity with parsing data

Nice to have

  • Bachelor’s Degree
  • 3+ years of experience using alerting and monitoring tools to monitor fleet performance at scale (Aternity and Observe telemetry tools preferred)
  • 3+ years of experience with Tableau, Snowflake, Elastic, Databricks, Power BI, and log analytics
  • Experience in a regulated financial services organization
  • Working knowledge of Cyber, Network, and Identity solutions
  • 3+ years of experience extracting data from Windows and/or Mac endpoints
  • 3+ years of experience working within an Agile environment

What we offer

  • Performance-based incentive compensation, which may include cash bonus(es) and/or long-term incentives (LTI)
  • A comprehensive, competitive, and inclusive set of health, financial and other benefits that support your total well-being

Similar Jobs for Lead Observability Platform Engineer

8 matching positions

Lead Platform Engineer

This is an opportunity to lead teams, influence architecture, and still stay clo...
Location: United Kingdom
Salary: 90000.00 - 110000.00 GBP / Year
Linux Recruit
Expiration Date: Until further notice
Requirements
  • Lead engineering teams and manage stakeholders effectively whilst remaining hands on technically
  • Strong focus on AWS, Kubernetes (including EKS and AKS), Terraform, Jenkins, and observability tooling such as Prometheus and OpenTelemetry
Job Responsibility
  • Work on large scale government and public sector programmes focused on modernising legacy estates, improving existing infrastructure, and building secure cloud native platforms across AWS and Kubernetes environments
  • Break complex systems down into scalable microservices and deliver modern platform capabilities that improve reliability, scalability, and developer experience
  • Help drive cloud optimisation across large and complex estates, applying a platform product mindset to build scalable and reusable solutions rather than one-off fixes
  • Lead engineering teams and manage stakeholders effectively whilst remaining hands-on technically
  • Play a key role in mentoring engineers, influencing technical direction, and helping define engineering standards across major programmes
What we offer
  • Unlimited training budget
  • Active encouragement for engineers to continue developing their professional expertise
  • Support for engineers to achieve three Kubernetes certifications within their first year, fully funded and backed by structured internal support and mentoring
  • Fulltime

Lead Observability Engineer

We are seeking a Lead Observability Engineer to join the team, and be able to wo...
Salary: Not provided
N-iX
Expiration Date: Until further notice
Requirements
  • 5+ years of engineering experience in cloud observability platforms, infrastructure, and telemetry systems
  • Deep experience in alerting, notifications, and monitoring at scale
  • Advanced expertise with ClickHouse, or similar high-performance analytical databases, for telemetry storage and querying
  • Hands-on experience migrating telemetry/storage solutions (preferably from Cosmos DB to ClickHouse or equivalent)
  • Solid understanding of telemetry pipelines, cloud-native monitoring, and best practices
  • Experience with dashboarding and visualization tools (Grafana, Kibana, or similar)
  • Strong scripting and automation skills (Python, Bash, Terraform or equivalent)
  • Proven collaboration and communication skills across cross-functional teams.
Job Responsibility
  • Lead the migration and transformation of telemetry storage from custom Cosmos DB solutions to ClickHouse, building a scalable and reliable end-to-end observability platform
  • Architect, implement, and maintain alerting and notification systems integrated with ClickHouse for critical services and applications
  • Develop, deploy, and operate high-throughput telemetry pipelines, ensuring accurate and actionable monitoring across cloud environments
  • Collaborate with engineering and product teams to define and champion observability best practices
  • Work with DevOps and development teams to automate collection, ingestion, and retention policies for logs, metrics, and traces
  • Drive continuous improvement in system performance, stability, and reliability through effective observability
  • Participate in on-call rotations, incident response, and root cause analysis to enhance monitoring and alerting capabilities.
What we offer
  • Flexible working format - remote, office-based or flexible
  • A competitive salary and good compensation package
  • Personalized career growth
  • Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
  • Active tech communities with regular knowledge sharing
  • Education reimbursement
  • Memorable anniversary presents
  • Corporate events and team buildings
  • Other location-specific benefits
  • Fulltime

Lead Platform Engineer

We’re looking for a lead platform engineer who can make that a reality. You will...
Location: United Kingdom
Salary: 90000.00 - 110000.00 GBP / Year
Linux Recruit
Expiration Date: Until further notice
Requirements
  • Cloud Experience: AWS alongside Azure and GCP as a bonus
  • Infrastructure as code: Terraform
  • CI/CD & automation: GitHub Actions, Jenkins or similar
  • Kubernetes & containers (EKS/AKS)
  • Observability & reliability: Prometheus, Grafana, OpenTelemetry
  • Platform thinking: internal developer platforms, self-service tooling
  • Leadership experience: mentoring engineers and influencing technical direction
Job Responsibility
  • Delivering modernisation of clients' platforms
  • Moving legacy applications into microservices
  • Pushing environments in AWS and Kubernetes
What we offer
  • Unlimited training budget
  • Encouraged to learn new technologies
  • Funding for Kubernetes certifications
  • Clear support to continue progressing
  • Fulltime

Lead Platform Engineer

Are you looking for a role where you can lead teams, influence strategy, stay ha...
Location: United Kingdom
Salary: 90000.00 - 110000.00 GBP / Year
Linux Recruit
Expiration Date: Until further notice
Requirements
  • Cloud – AWS at scale (Azure or GCP exposure a bonus)
  • Kubernetes & container platforms in production (EKS/AKS)
  • Infrastructure as Code – Terraform
  • CI/CD & automation – GitHub Actions, Jenkins or similar
  • Observability & reliability – Prometheus, Grafana, OpenTelemetry
  • Platform thinking – internal developer platforms, self service tooling
  • Leadership – mentoring engineers and influencing technical direction
Job Responsibility
  • Lead teams
  • Influence strategy
  • Stay hands-on
  • Design secure, cloud native platforms across AWS
  • Build out multi-account landing zones
  • Scale Kubernetes environments
  • Modernise legacy systems into microservices architectures
  • Shape platform strategy
  • Set standards across security, reliability, CI/CD and observability
  • Work closely with stakeholders and engineering teams
What we offer
  • Structured learning from day one
  • Dedicated career coach
  • Access to a huge range of training resources
  • Full funding for certifications like CKA, CKAD and CKS
  • Clear support to keep progressing
  • Fulltime

Lead Engineer – Platform Engineering

We are looking for a Lead DevOps Engineer to join the Platform Engineering team ...
Location: United States, St Petersburg, Florida
Salary: Not provided
Raymond James
Expiration Date: Until further notice
Requirements
  • Deep experience with virtualization platforms (e.g., VMware vSphere/ESXi, Hyper‑V, KVM/Nutanix)
  • Hands‑on experience with configuration management tools such as Ansible
  • Experience implementing and supporting enterprise load balancer solutions (e.g., F5 BIG-IP, NGINX, Azure/AWS load balancers), including configuration, automation, and traffic‑routing policies
  • Familiarity with AI‑assisted operations tools (AIOps), or how they can fit into the workflow
  • Solid understanding of CI/CD systems (GitHub Actions, Azure DevOps, Jenkins, GitLab CI)
  • Advanced scripting skills in Python, PowerShell, and/or Bash
  • Experience with provisioned workflow development in ServiceNow
  • Strong knowledge of monitoring and logging platforms (Prometheus/Grafana, Splunk, Elastic, Datadog, etc.)
  • Understanding of security best practices, IAM/RBAC, secrets management, and compliance frameworks
  • Strong networking and systems fundamentals (TCP/IP, DNS, load balancing, storage)
Job Responsibility
  • Design, build, and maintain automation for VM provisioning, configuration, and lifecycle management
  • Enhance and support CI/CD pipelines for infrastructure and platform services
  • Provide technical leadership and mentorship to engineers across the platform engineering team
  • Use AI‑assisted tooling when beneficial for anomaly detection, event correlation, and operational insights
  • Work on standardized VM images, templates, and OS baselines to ensure consistency and security
  • Improve platform reliability through monitoring, alerting, and SRE‑aligned practices
  • Develop and maintain observability tooling, dashboards, and automated remediation workflows
  • Ensure security best practices across VM platforms, including RBAC, secrets management, and patching
  • Optimize VM capacity, performance, and resource utilization across environments
  • Collaborate with development, cloud, and security teams to deliver stable, self‑service platform capabilities
  • Fulltime

Lead Observability Engineer

Lead Observability Engineer role focusing on the Elastic Observability Platform,...
Location: India, Hyderabad
Salary: Not provided
Blue Yonder
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Engineering, MIS, or equivalent experience
  • 7–10+ years of experience in observability engineering, SRE, monitoring platform ownership, or infrastructure operations
  • Deep, hands-on expertise with Elastic Stack (Elasticsearch, Kibana, Logstash, Beats/Elastic Agent, APM)
  • Strong architectural knowledge of cloud (Azure/AWS) and hybrid observability patterns
  • Experience leading observability for infrastructure, cloud platforms, network systems, Kubernetes, and Microsoft 365
  • Proven experience designing monitoring for SaaS platforms (Workday, Salesforce, ServiceNow)
  • Advanced scripting/automation experience (Python, PowerShell, Bash)
  • Strong knowledge of API integrations, data pipelines, and log-flow engineering
  • Experience leading incident diagnostics and delivering visibility for RCA and operational improvement
  • Strong analytical, architectural, and troubleshooting skills with a platform-owner mindset
Job Responsibility
  • Receives work assignments through the ticketing system or from senior leadership
  • Provides Tier-4 engineering expertise, platform ownership, and technical leadership for all observability capabilities across hybrid cloud, on-premises, and SaaS environments
  • Leads the design, architecture, and maturity of the enterprise observability ecosystem with a primary focus on the Elastic Observability Platform
  • Drives the enterprise strategy for logging, metrics, traces, synthetics, and alerting—including governance, standardization, and performance optimization
  • Partners closely with Cloud, Infrastructure, Security, Enterprise Applications, and SRE leadership to define observability frameworks
  • Ensures observability platforms meet enterprise requirements for security, performance, availability, compliance, and scalability
  • Oversees monitoring implementations for key SaaS applications including Workday, Salesforce, ServiceNow, and Microsoft 365
  • Provides guidance, mentorship, and direction to observability engineers, SREs, and operational teams
  • Acts as a strategic advisor during major incidents by providing real-time diagnostics, correlation insights, and driving RCA improvements
  • Required to provide on-call support during off-hours on weekdays, weekends, and holidays on a rotating basis
  • Fulltime

Senior Software Engineer, Platform Observability

Everlaw is looking for a Senior Software Engineer that brings experience in buil...
Location: United States, Oakland
Salary: 164000.00 - 208000.00 USD / Year
Everlaw
Expiration Date: Until further notice
Requirements
  • BS or MS in Computer Science, or equivalent coursework
  • At least 3 years of experience building logging, metrics, and tracing infrastructure
  • Proficiency in coding in a language such as C, C++, C#, Java, Python, JavaScript, Go or Rust
  • Experience with Infrastructure as Code and container solutions to manage cloud environments (e.g., Terraform, Ansible, Docker)
  • At least 1 year of experience leading multi-developer efforts, including planning, technical breakdown, and coordination
  • Excellent communication and collaboration skills
  • Please note that at this time, Everlaw is not sponsoring U.S. employment visas for this role. Due to federal contract requirements, Everlaw may only hire US citizens for this position.
Job Responsibility
  • Build observability strategies to support application and infrastructure metrics, logs, traces, dashboards, and alerts
  • Develop and maintain infrastructure as code (IaC) using tools such as Terraform and Ansible
  • Monitor usage trends to identify opportunities to optimize efficiency and performance of our metrics database and logging tools
  • Improve our on-call and incident management processes by encouraging deeper understanding, communication, and trust
  • Support developer projects by influencing design and implementation of infrastructure features as well as providing technical guidance
  • Support compliance efforts by promoting continuous documentation of our processes and involvement in audits
  • Provide technical mentorship to other engineers by both sharing your technical knowledge and becoming an expert in an area of our code base
What we offer
  • Equity program
  • 401(k) retirement plan with company matching
  • Health, dental, and vision
  • Flexible Spending Accounts for health and dependent care expenses
  • Paid parental leave and approximately 10 days (80 hours) per year of sick leave
  • Seventeen paid vacation days plus 11 federal holidays
  • Membership to Modern Health to help employees prioritize mental health and wellness
  • Annual allocation for Learning & Development opportunities and applicable professional membership dues
  • Company-sponsored life and disability insurance
  • Work in Uptown Oakland, just steps from the BART line and dozens of restaurants and walking distance to Lake Merritt
  • Fulltime

Senior Software Engineer, Platform Observability

Everlaw is looking for a Senior Software Engineer that brings experience in buil...
Location: United States, Oakland
Salary: 164000.00 - 239000.00 USD / Year
Everlaw
Expiration Date: Until further notice
Requirements
  • BS or MS in Computer Science, or equivalent coursework
  • At least 3 years of experience building logging, metrics, and tracing infrastructure
  • Proficiency in coding in a language such as C, C++, C#, Java, Python, JavaScript, Go or Rust
  • Experience with Infrastructure as Code and container solutions to manage cloud environments (e.g., Terraform, Ansible, Docker)
  • At least 1 year of experience leading multi-developer efforts, including planning, technical breakdown, and coordination
  • Excellent communication and collaboration skills that can motivate and move the team towards a common direction
  • Please note that at this time, Everlaw is not sponsoring U.S. employment visas for this role
  • Due to federal contract requirements, Everlaw may only hire US citizens for this position
Job Responsibility
  • Build observability strategies to support application and infrastructure metrics, logs, traces, dashboards, and alerts
  • Develop and maintain infrastructure as code (IaC) using tools such as Terraform and Ansible
  • Build custom libraries and plugins in Java and Python to allow engineers to generate meaningful metrics, logs and traces
  • Monitor usage trends to identify opportunities to optimize efficiency and performance of our metrics database and logging tools
  • Improve our on-call and incident management processes by encouraging deeper understanding, communication, and trust
  • Support developer projects by influencing design and implementation of infrastructure features as well as providing technical guidance
  • Support compliance efforts by promoting continuous documentation of our processes and involvement in audits
  • Provide technical mentorship to other engineers by both sharing your technical knowledge and becoming an expert in an area of our code base
  • Act as a code reviewer, reviewing code developed by others using your knowledge of programming languages, design patterns, and best practices
  • Contribute to documentation for internal engineering consumption or for external users of the Everlaw platform
What we offer
  • Competitive compensation
  • Comprehensive benefits package that includes medical, dental, wellness program
  • Paid parental leave
  • Professional development
  • Fully stocked kitchen
  • Equity program
  • 401(k) retirement plan with company matching
  • Health, dental, and vision
  • Flexible Spending Accounts for health and dependent care expenses
  • Paid parental leave and approximately 10 days (80 hours) per year of sick leave
  • Fulltime