The Staff Engineer, Site Reliability Engineer (SRE) will support the CVS Health PCW Digital organization's vision to deliver transformative applications and technology platform services. This role partners with key stakeholders, including Engineering, Architecture, and Product, to establish and adopt architectural best practices, guidelines, and standards for delivering resilient systems with a superior customer experience.
Job Responsibilities:
Partner with Architects, Dev, Product, and Business owners to ensure implementations are architected for production resiliency
Work closely with Engineering teams during the design phase; build and perform infrastructure upgrades
Troubleshoot mission-critical application workflows in real time
Develop tools, frameworks, and instrumentation to validate and increase rollout success for applications
Evangelize, mentor, and coach teams across SRE and Front-Line Ops on best practices of Observability, Monitoring and SOPs
Requirements:
7+ years of experience in Site Reliability or DevOps Engineering
7+ years of experience in Enterprise level Infrastructure and Operations
7+ years of experience with Logging as a Service components and the ability to automate or build dashboards on Splunk, Loki, or Logstash
7+ years of experience managing application availability for a 24x7 high-availability platform
7+ years of extensive experience with observability and AIOps tools like Datadog, the Grafana stack, Dynatrace, and AppD
7+ years of experience debugging across a variety of integrated technical platforms on an API gateway
5+ years of experience with Java, Python, JavaScript, React or NodeJS
5+ years of experience with tools like Rally, Confluence and other CI/CD extenders
5+ years of experience writing automation scripts and building dashboards for Application Performance management
5+ years of experience with proof of concepts and implementation for automation tools
5+ years of experience using GCP components like GKE, Stackdriver, Pub/Sub, Cloud Functions, Memorystore, Cloud Storage, and Cloud Run
3-5 years of experience transitioning platforms to the cloud and containerization (GCP, AWS, and/or Rancher)
2-3 years of working knowledge of one or more databases: Oracle, SQL Server, MS-SQL, Postgres, and MongoDB
2-4 years of extensive experience in GraphQL, Gateway, or similar technologies
Must be flexible enough to participate in an on-call support rotation
Bachelor's degree in electrical engineering, computer science, or a closely related discipline
Nice to have:
Hands-on experience with implementing in-memory caching solutions
Experience with Redis
Experience in Kubernetes and Istio service mesh
Experience building agentic AI and configuring LLMs like Google Gemini, Mistral, Llama, Qwen, OpenAI GPT, or Hermes
Good business acumen and communication skills, both written and verbal
Excellent problem-solving and organizational skills