Staff Engineer, Site Reliability Lead, CVS Health

CVS Health

Location:
United States, Scottsdale

Category:
IT - Software Development

Contract Type:
Employment contract

Salary:

118450.00 - 236900.00 USD / Year

Job Description:

The Staff Engineer, Site Reliability Engineer (SRE) will support the CVS Health PCW Digital organization's vision to deliver transformative applications and technology platform services. This role partners with key stakeholders, including Engineering, Architecture, and Product, to establish and adopt architectural best practices, guidelines, and standards for delivering resilient systems with a superior customer experience.

Job Responsibility:

Partner with Architects, Dev, Product, and Business owners to ensure implementations are architected for production resiliency
Work closely with Engineering teams during the design phase, and build and perform infrastructure upgrades
Troubleshoot mission-critical application workflows in real time
Develop tools, frameworks, and instrumentation to validate and increase rollout success for applications
Evangelize, mentor, and coach teams across SRE and Front-Line Ops on best practices for Observability, Monitoring, and SOPs

Requirements:

7+ years of experience in Site Reliability or DevOps Engineering
7+ years of experience in enterprise-level Infrastructure and Operations
7+ years of experience with Logging as a Service components and the ability to automate or build dashboards on Splunk, Loki, or Logstash
7+ years of experience managing application availability for 24x7 high-availability platforms
7+ years of extensive experience with Observability and AIOps tools like Datadog, the Grafana stack, Dynatrace, and AppD
7+ years of experience debugging across a variety of integrated technical platforms on an API gateway
5+ years of experience with Java, Python, JavaScript, React, or Node.js
5+ years of experience with tools like Rally, Confluence and other CI/CD extenders
5+ years of experience writing automation scripts and building dashboards for Application Performance Management
5+ years of experience with proofs of concept and implementation of automation tools
5+ years of experience using GCP components like GKE, Stackdriver, Pub/Sub, Cloud Functions, Cloud Memorystore, Cloud Storage, and Cloud Run
3-5 years of experience transitioning platforms to the cloud and containerization (GCP, AWS, and/or Rancher)
2-3 years of working knowledge of one or more databases: Oracle, SQL Server, MS-SQL, Postgres, and MongoDB
2-4 years of extensive experience with GraphQL, Gateway, or similar technologies
Must be flexible to participate in an on-call support rotation
Bachelor's degree in electrical engineering/computer science or a closely related discipline

Nice to have:

Hands-on experience with implementing in-memory caching solutions
Experience with Redis
Experience with Kubernetes and the Istio service mesh
Experience building agentic AI and configuring LLMs like Google Gemini, Mistral, Llama, Qwen, OpenAI GPT, or Hermes
Good business acumen and communication skills, both written and verbal
Excellent problem-solving and organizational skills

What we offer:

Affordable medical plan options
401(k) plan with matching company contributions
Employee stock purchase plan
No-cost wellness screenings
Tobacco cessation and weight management programs
Confidential counseling and financial coaching
Paid time off
Flexible work schedules
Family leave
Dependent care resources
Colleague assistance programs
Tuition assistance
Retiree medical access

Additional Information:

Job Posted:
November 15, 2025

Expiration:
November 21, 2025

Employment Type:

Full-time

Work Type:

Hybrid work
