Infrastructure Site Reliability Engineer, CVS Health

CVS Health

Location:
United States

Category:
IT - Software Development

Contract Type:
Not provided

Salary:

118,450.00 - 260,590.00 USD / Year

Job Description:

As an Infrastructure Site Reliability Engineer, you will be responsible for designing, implementing, and managing the infrastructure systems and tools that enable the reliability and performance of our technology platforms supporting various business initiatives within CVS Health. This role requires a strong background in infrastructure engineering and a commitment to proactively monitoring, troubleshooting, and optimizing systems for maximum uptime and performance.

Job Responsibilities:

Manage and maintain various systems and infrastructure, such as servers, storage, mainframe, iSeries, backup, archive, and recovery
Participate in on-call rotation to ensure availability and uptime of critical systems
Develop and maintain best practices documentation, including system architecture diagrams, standard operating procedures, and runbooks
Perform system and application performance analysis
Streamline and optimize operational processes, procedures, and documentation
Develop, modify, and implement incident and problem management processes
Establish a comprehensive SRE process
Collaborate with development teams on code reviews, performance optimization, and application deployment processes
Drive reliability engineering practices, including monitoring, alerting, incident management, capacity planning, and disaster recovery
Automate infrastructure deployments, upgrades, and maintenance tasks
Analyze historical usage patterns and growth projections to forecast future capacity requirements
Establish and maintain monitoring systems to track performance and utilization

Requirements:

7+ years of experience in Infrastructure Engineering, System Administration, or related roles
3+ years of experience with cloud platforms (e.g., Amazon Web Services, Microsoft Azure) and infrastructure-as-code tools (e.g., Terraform, CloudFormation)
2+ years of experience with at least one configuration management tool such as Ansible, Puppet, or Chef
2+ years of experience with containerization technologies such as Docker and container orchestration platforms like Kubernetes
2+ years of experience with networking principles and protocols, including TCP/IP, DNS, load balancing, and firewalls
1+ years of experience with incident management, performance monitoring, and capacity planning tools
Bachelor's degree or equivalent experience (High School Diploma and 4 years relevant experience)

Nice to have:

Excellent troubleshooting and problem-solving skills, with the ability to identify, communicate, and resolve technical issues swiftly

What we offer:

Affordable medical plan options
401(k) plan with matching company contributions
Employee stock purchase plan
Wellness screenings
Tobacco cessation and weight management programs
Confidential counseling and financial coaching
Paid time off
Flexible work schedules
Family leave
Dependent care resources
Colleague assistance programs
Tuition assistance
Retiree medical access

Additional Information:

Job Posted:
November 10, 2025

Expiration:
November 12, 2025

Employment Type:

Full-time

Work Type:

Remote work
