Principal Architect - Cloud and Observability Job at CVS Health

Principal Architect - Cloud and Observability

CVS Health

Location:
United States

Category:
IT - Administration

Contract Type:
Employment contract

Salary:
144,200.00 - 288,400.00 USD / Year

Job Description:

We're building a world of health around every individual, shaping a more connected, convenient and compassionate health experience. At CVS Health®, you'll be surrounded by passionate colleagues who care deeply, innovate with purpose, hold themselves accountable and prioritize safety and quality in everything we do. Join us and be part of something bigger: helping to simplify health care one person, one family and one community at a time.

Position Summary:

We're hiring a Principal Architect to take ownership of how we do observability and hybrid cloud at CVS Health. This person will sit within our Enterprise Architecture organization and be responsible for the architecture, standards, and technical direction behind our observability platforms and our multi-cloud infrastructure posture.

We run workloads across on-prem private cloud (OpenShift, KVM, Dell PowerFlex), Azure, AWS, and GCP. We need someone who can build and maintain the reference architectures, telemetry standards, and instrumentation patterns that let our engineering teams monitor all of that consistently. We've committed to an OpenTelemetry-first approach and use the Grafana stack (Mimir, Loki, Tempo) as our primary backends, but we also operate Datadog, Splunk, and Dynatrace in various parts of the org.

On the cloud side, there is real work to do around workload identity, runtime selection, autoscaling guidance, and FinOps. Teams are asking for concrete standards they can follow.

This is a hands-on role. You'll write architecture docs, build proof-of-concepts, configure OTel pipelines, and present to leadership.

This is a remote position that can be based anywhere in the continental USA.

Job Responsibility:

Own the enterprise observability reference architecture covering metrics, logs, traces, and events across all environments (cloud and on-prem)
Drive the OpenTelemetry-first instrumentation strategy -- standard libraries, semantic conventions, collector topologies (DaemonSet, gateway, sidecar), and pipeline design
Build and operate telemetry pipelines on Grafana Mimir, Loki, and Tempo, including multi-tenant configurations, retention policies, and capacity planning
Define how we measure reliability: SLOs, SLIs, error budgets, and alerting frameworks -- consistently across all lines of business
Own the integration between observability tooling and incident management (ServiceNow ITOM, xMatters)
Drive telemetry schema standards to ensure teams emit data that is useful downstream, not just technically compliant
Build and maintain reference architectures for our hybrid footprint: OpenShift on-prem with KVM/libvirt and Dell PowerFlex storage, plus Azure, AWS, and GCP
Lead standards work around workload identity and federation using SPIFFE/SPIRE and cloud-native IAM patterns to move away from static secrets
Provide guidance on compute runtime selection -- containers vs. VMs vs. bare metal vs. serverless -- with a clear decision framework for teams
Help teams connect autoscaling and capacity planning behavior to actual telemetry signals
Push FinOps maturity forward by integrating cost data into the observability stack, establishing unit economics, and working toward open billing standards like FOCUS
Identify where AI/ML adds practical value in our observability stack -- anomaly detection, root cause analysis, log clustering, and smarter alerting
Define observability standards for AI-powered systems (agents, RAG pipelines) -- covering latency, token costs, model drift, and related signals
Ensure new AI-powered platforms are instrumented correctly from day one
Participate in cross-functional architecture working groups focused on observability and hybrid cloud standards
Publish architecture decision records and reference implementations that teams can actually use
Mentor architects and platform engineers, and conduct architecture reviews to raise the bar across the org
Work with security and compliance on HIPAA, SOX, and PCI requirements as they apply to telemetry and cloud infrastructure
Represent CVS Health in vendor evaluations and stay connected to the open-source ecosystem (CNCF, OpenTelemetry, Grafana Labs)

Requirements:

10+ years in infrastructure, cloud architecture, platform engineering, or SRE
8+ years of architecture work in observability, cloud infrastructure, or both at a large enterprise
Solid experience with at least two of Azure, AWS, or GCP -- including networking, identity, compute, and storage
5+ years with Kubernetes in production (OpenShift, EKS, AKS, or GKE)
5+ years with OpenTelemetry or similar frameworks (collectors, SDKs, semantic conventions, pipeline design)
5+ years with observability platforms: Grafana/Mimir/Loki/Tempo, Prometheus, Datadog, Splunk, Dynatrace, or comparable tools
Experience defining SLOs/SLIs and building alerting strategies at an organizational level
Proven track record writing architecture standards that other teams adopted and followed
Able to communicate clearly with both engineers and senior leadership

Nice to have:

On-prem / private cloud experience (OpenShift Virtualization, KVM/libvirt, VMware, Dell PowerFlex or similar storage)
Workload identity (SPIFFE/SPIRE) and zero-trust networking
Infrastructure-as-code (Terraform, Pulumi, Helm, ArgoCD)
Streaming platforms such as Kafka or Confluent, especially in telemetry pipeline contexts
AIOps or ML-based anomaly detection experience
FinOps background -- cloud cost optimization, chargeback, unit economics
Service mesh (Istio, Envoy, Linkerd) or eBPF-based tools (Cilium, Pixie)
Involvement in open-source communities (CNCF, OpenTelemetry, etc.)
Healthcare, insurance, or financial services experience (HIPAA/SOX familiarity)
Cloud certifications are a plus but not required

What we offer:

medical, dental, and vision coverage
paid time off
retirement savings options
wellness programs
other resources, based on eligibility
bonus, commission or short-term incentive program
equity award program

Additional Information:

Job Posted:
April 24, 2026

Expiration:
June 29, 2026

Employment Type:
Full-time

Work Type:
Remote work
