Senior DevOps/SRE Engineer

India · Job Posted January 03, 2026

Job Description

We’re hiring experienced Site Reliability Engineer / DevOps Engineer hybrids. You’ll be a key player in ensuring the health, reliability, and agility of over 200 mission-critical microservices—while shaping the future of AI-driven commerce.

Job Responsibility

  • Monitor, analyze, and enhance the health of 200+ distributed microservices
  • Own incident response and drive operational excellence as a member of our 24/7 SRE on-call rotation, ensuring uptime and meeting strict SLAs
  • Deliver key DevOps outcomes: CVE remediation, SWUs, software upgrades, automated failover, resilience engineering, robust security design, and infrastructure improvements
  • Collaborate cross-functionally to design, implement, and maintain monitoring, alerting, and automation frameworks
  • Build standardized tooling and practices supporting rapid recovery, continuous improvement, and compliance
  • Develop backend software in Java, including debugging, optimizing, and maintaining high-availability server-side applications and distributed systems

Requirements

  • Strong experience in Site Reliability Engineering, DevOps, or Production Operations—preferably supporting large-scale systems
  • Hands-on expertise with cloud infrastructure (AWS, GCP, or Azure), containers and orchestration, CI/CD, monitoring stacks, and automation
  • Working knowledge of Java
  • Solid understanding of incident management, reliability engineering, and microservice architectures
  • Willingness to participate in a rotating 24/7 on-call schedule
  • Excellent problem-solving, communication, and teamwork skills

Nice to have

Background in security best practices, system resilience, and disaster recovery is a plus

What we offer

  • Flexible working format - remote, office-based or flexible
  • A competitive salary and good compensation package
  • Personalized career growth
  • Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
  • Active tech communities with regular knowledge sharing
  • Education reimbursement
  • Memorable anniversary presents
  • Corporate events and team buildings
  • Other location-specific benefits

Similar Jobs for Senior DevOps/SRE Engineer

8 matching positions

Senior Software Engineer - SRE

Roku is changing how the world watches TV. Roku is the #1 TV streaming platform ...
Location
India , Bengaluru
Salary
Not provided
Roku
Expiration Date
Until further notice
Requirements
  • Preferably 8+ years of experience in DevOps/SRE roles, with demonstrated expertise in implementing SRE principles, SLO/SLI frameworks, and error budget policies in production environments
  • Deep experience with observability and monitoring platforms such as Prometheus, Grafana, Datadog, New Relic, or equivalent, including experience building custom dashboards, alerts, and SLO-based monitoring
  • Strong background in incident management, including experience as an Incident Commander, conducting blameless postmortems, and implementing systematic reliability improvements based on incident learnings
  • Strong understanding of distributed systems and reliability engineering, including failure modes, fault tolerance patterns, circuit breakers, bulkheads, rate limiting, and graceful degradation strategies
  • Experience with a number of the following: Kubernetes, Docker, ECS, and service meshes such as Istio, Envoy, Linkerd, and Solo
  • Experience in cloud-focused software development, preferably in Go, Python, or other object-oriented programming languages
  • Experience with Infrastructure as Code (IaC) tools such as Terraform, Ansible, or CloudFormation
  • Experience with CI/CD automation, including GitLab pipelines and other related tools
  • Strong hands-on experience with cloud platforms such as AWS, GCP or Azure
  • Proven track record of implementing scalable, high-performance infrastructure solutions in fast-paced, dynamic environments
Job Responsibility
  • Design & Infrastructure
  • Contribute to postmortem culture by facilitating comprehensive, blameless post-incident reviews that identify root causes, contributing factors, and actionable remediation items. Track incident trends to identify systemic issues and prioritize reliability improvements
  • Implement chaos engineering practices to proactively identify failure modes, validate system resilience, and build confidence in recovery procedures. Conduct game days and disaster recovery exercises
  • SRE Process & Principles Implementation
  • Deploy and evolve SRE practices across the organization by establishing core SRE principles, frameworks, and methodologies. Define and implement service reliability practices, including Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets, to balance innovation velocity with system reliability
  • Manage Error Budgets as a mechanism for making data-driven decisions about feature velocity vs. reliability. Track, report, and enforce error budget policies, facilitating conversations between engineering and product teams about risk tolerance and release decisions
  • Reliability Engineering & Infrastructure
  • Reduce toil through automation by identifying repetitive operational work and systematically eliminating it through infrastructure-as-code, automation frameworks, and intelligent tooling. Measure and track toil reduction efforts, aiming to keep toil below 50% of team time
  • Implement capacity planning processes that ensure systems have adequate headroom to meet SLOs during peak traffic, unexpected load spikes, and degraded states. Develop predictive models and automated scaling mechanisms
  • Observability, Monitoring & Reporting
What we offer
  • Global access to mental health and financial wellness support and resources
  • Healthcare (medical, dental, and vision)
  • Life, accident, disability, commuter, and retirement options (401(k)/pension)
  • Time off in accordance with local leave policies
  • Fulltime

Senior DevOps Engineer

We are looking for a highly skilled Senior DevOps Engineer to design, implement,...
Location
India , Ahmedabad
Salary
Not provided
Codezeros
Expiration Date
Until further notice
Requirements
  • 7+ years of experience in DevOps/SRE roles
  • Strong hands-on experience with both AWS and Azure
  • Expertise in Linux system administration
  • Proficiency in scripting (Bash, Python, or Go)
  • AWS Services: EC2, S3, RDS, Lambda, VPC, IAM
  • Azure Services: VM, Blob Storage, Azure Functions, VNet, Azure AD
  • CI/CD pipelines and automation
  • Infrastructure as Code (Terraform preferred)
  • Containerization and orchestration (Docker + Kubernetes)
  • Networking fundamentals (DNS, TCP/IP, Load Balancing)
Job Responsibility
  • Design and manage scalable, fault-tolerant systems on AWS and Azure
  • Optimize cloud cost, performance, and security
  • Implement high availability, disaster recovery, and backup strategies
  • Build and maintain CI/CD pipelines using tools like Jenkins, GitHub Actions, and Azure DevOps
  • Automate build, test, and deployment processes
  • Develop and manage infrastructure using Terraform, AWS CloudFormation, and Azure Resource Manager
  • Ensure version-controlled and reusable infrastructure modules
  • Manage containerized workloads using Docker and Kubernetes
  • Deploy and manage clusters on AWS EKS and Azure AKS
  • Implement observability solutions using Prometheus, Grafana, the ELK Stack, and cloud-native tools (CloudWatch, Azure Monitor)
  • Fulltime

Senior Software Engineer (Cloud & DevOps)

At 3Shape, we use cloud platforms to deliver secure, reliable services to both i...
Location
Denmark , Copenhagen
Salary
Not provided
3Shape
Expiration Date
Until further notice
Requirements
  • 5 years of experience, including a minimum of 3 years of professional C#/.NET backend development, ideally in a cloud environment.
  • Minimum 3 years of hands-on DevOps/SRE experience, ideally in a role combining software development and operations.
  • Strong backend engineering fundamentals (design, performance, security, and maintainability).
  • Experience with API design, automated testing, code reviews, and building maintainable systems.
  • Experience with containerized workloads and Kubernetes (e.g., Azure Kubernetes Service).
  • Curiosity for modern engineering practices and a strong understanding of core Azure concepts (networking, compute, storage, identity, and databases).
  • Experience with monitoring/observability in Azure (e.g., Azure Monitor, Application Insights, Log Analytics) and incident handling is a plus.
  • A strong ownership mindset: automation-first, focus on reliability, and continuous improvement of quality and stability.
Job Responsibility
  • Design, implement, and maintain backend services in the Account domain, delivering features end-to-end from implementation and testing to deployment readiness.
  • Be the team's primary point of contact for DevOps topics and drive improvements across CI/CD, AKS/Kubernetes, Infrastructure as Code, observability, and platform stability.
  • Collaborate with platform and product teams across 3Shape to align on Azure standards and best practices, especially around Infrastructure as Code, observability, and operational readiness.
  • Help define actionable alerts and dashboards, improve runbooks, and build safe automation so incidents can be detected, triaged, and mitigated quickly even outside normal working hours.
What we offer
  • Central Copenhagen location
  • An attractive healthcare package to keep you fit and well.
  • Breakfast every day, and a delicious and healthy lunch cooked by our private chefs.
  • A joint purpose: to enable dentists to provide superior dental care to every patient, every time.
  • Fulltime

Senior Automation Engineer

Cloud Technology Services – External Job Description. About Citi Cloud Technolog...
Location
India , Pune
Salary
Not provided
Citi
Expiration Date
Until further notice
Requirements
  • 6–10 years of experience with Windows and RHEL server platforms and system integration
  • Strong background in enterprise storage engineering or administration
  • Expertise in DevOps tooling: Jenkins, Git, Ansible
  • CI/CD integration experience
  • Hands‑on experience with Dell‑EMC, Hitachi storage arrays, Cisco MDS switches, and Emulex HBAs
  • Strong written and verbal communication skills
  • Ability to work in a global, matrixed environment
  • Experience with Agile, DevOps/SRE methodologies, and automation scripting (PowerShell, Ansible)
  • Proven ability to independently manage multiple tasks and drive engineering improvements
  • Ability to collaborate with teams across global time zones
Job Responsibility
  • Engineer, integrate, and maintain CI/CD pipelines for enterprise storage software from Dell‑EMC, Hitachi, Cisco MDS, Emulex, and Veritas
  • Evaluate and certify HBA cards, drivers, and firmware for Windows Server (2016–2025), RHEL 8/9/10, and VMware ESX
  • Diagnose and resolve SAN stack compatibility issues impacting OS upgrades across Dell and HPE servers
  • Produce clear, validated engineering documentation and updated SAN Stack standards
  • Automate testing and configuration validation using Ansible and PowerShell
  • Support engineering teams with cluster testing for next‑generation Emulex HBAs (16/32Gb)
  • Lead technical discussions with Emulex and other vendors to investigate vulnerabilities and enhance driver/firmware stability
  • Develop automated test plans using Citi’s internal integration test environment
  • Raise deployment requests for validated driver/firmware packages and maintain internal validation scripts
  • Ensure all work meets audit, compliance, and information security requirements
  • Fulltime

Senior DevOps Engineer

We are seeking a highly skilled Senior DevOps Engineer with 5+ years of experien...
Location
Spain , Barcelona
Salary
Not provided
Abacum Inc
Expiration Date
Until further notice
Requirements
  • 5+ years of experience in DevOps/SRE (or a comparable role)
  • Deep knowledge of cloud systems, especially AWS
  • Designing, building, and maintaining CI/CD pipelines using cloud-native tools like GitHub Actions
  • Strong proficiency in IaC (Terraform is a plus)
  • Strong proficiency building and scaling Kubernetes clusters
  • Expertise in Bash/POSIX shell and a high-level language like Python for automation, API interaction, and custom tooling
  • Strong proficiency architecting monitoring systems (Datadog is a plus)
  • Versatile (hands-on doer at ease from engineering strategy to low level details, starter attitude)
  • Empathetic and humble (knows how to speak up with respect, how to compromise, and how to disagree and commit when needed)
Job Responsibility
  • Design and implement our systems to be efficient, scalable, accountable, and secure
  • Team up with other Engineers to perform experiments and test new ideas
  • Build a strong DevOps culture and tooling that enable our delivery teams to be autonomous while providing best practices (security, observability, scalability, performance, etc.)
  • Deploy and manage our infrastructure provisioning
  • Develop and drive real time observability solutions that provide visibility into system health
  • Provide technical guidance and educate team members and coworkers on operations and cloud best practices
  • Continuously improve development delivery CI/CD
  • Develop and implement security measures for the development processes and operational needs identified by our security and compliance team
  • Build and scale our Kubernetes clusters and workloads
  • Manage and scale our cloud databases
What we offer
  • Competitive compensation including equity package
  • Competitive vacation policy
  • Access to Meditopia
  • Hybrid working model and flexible working hours
  • Personal development including language courses
  • Fulltime

Senior Platform Engineer

We’re hiring a Senior Platform Engineer to join the Platform team. This is a str...
Location
United States , Remote; Honolulu; San Diego; Seattle; Colorado Springs; Austin; Washington; Tampa
Salary
180000.00 - 230000.00 USD / Year
Onebrief
Expiration Date
Until further notice
Requirements
  • 5+ years in Platform, DevOps, or Site Reliability Engineering with an infrastructure and operations focus
  • Proven partner to DevOps/SRE and application teams; collaborates well across functions and shares context openly
  • Clear, concise writing; strong documentation habits and async communication
  • Infrastructure as Code: Terraform (or CloudFormation), Ansible
  • Containers and orchestration: Docker; Kubernetes design, deployment, and operations
  • CI/CD: experience building and maintaining pipelines (GitLab CI/CD, Jenkins, GitHub Actions)
  • Scripting: proficiency with at least one of Python, Go, or Bash
Job Responsibility
  • Building and automating the platform: Design, provision, and manage cloud and on‑prem environments with Terraform and Ansible; operate secure, resilient Kubernetes clusters for multi‑level deployments
  • Designing and evolving CI/CD: Reduce lead time and maintain strict gates and environment parity by improving pipelines in GitLab CI/CD, Jenkins, and GitHub Actions
  • Upholding reliability, security, and compliance: Partner with SREs to implement monitoring, logging, and alerting (Prometheus, Grafana, ELK/Datadog); embed RMF and STIG controls into automation and day‑to‑day operations
  • Operating and optimizing our footprint: Manage AWS/Azure/GCP resources and on‑prem networks. Balance cost, performance, and security while unblocking product teams
  • Supporting the team, documenting, and mentoring: Triage and resolve complex platform issues; write clear docs for architectures and runbooks; share context and mentor teammates
What we offer
  • Flexible Work Environment: Remote work with flexible hours and unlimited PTO
  • Comprehensive Health Coverage: Health, dental, vision, and life insurance
  • Retirement Plan: 401(k) plan to secure your future
  • Parental Leave: 8 weeks at 100% regardless of state
  • Company Retreats: Annual company summit trips
  • Home Office Budget: $1,000 per year for home office improvements
  • Offers Equity
  • Fulltime

Senior Platform Engineer

We’re building a world-class engineering organization, and this role is a founda...
Location
United States , New York
Salary
170000.00 - 205000.00 USD / Year
CampusGroup
Expiration Date
Until further notice
Requirements
  • 5–10 years of experience as a backend engineer, platform engineer, DevOps/SRE engineer, or data engineer
  • Experience building and maintaining production services, pipelines, and cloud infrastructure at scale
  • Strong in Go and designing backend systems for reliability and clarity
  • Hands-on experience with GCP, Terraform, Docker, and CI/CD tooling
  • Deep care about reliability, performance, graceful degradation, and observability
  • Experience with data pipelines, warehouses, or operational data workflows
  • Curious and growth-oriented, always learning, improving, and raising the engineering bar
  • Communicates clearly and works collaboratively with low ego
Job Responsibility
  • Ship backend systems that power admissions, financial aid, registrar workflows, reporting, and real-time student experiences
  • Build and maintain data pipelines and ingestion flows supporting analytics, product, and operational automation
  • Own CI/CD and deployment workflows to ensure fast, reliable releases across backend, mobile, and frontend
  • Design and scale our cloud infrastructure using GCP and Terraform
  • Improve developer experience and data scientist experience — tools, environments, testing workflows, and internal automations
  • Partner across engineering (backend, mobile, data, product) to accelerate delivery and reduce operational friction
  • Strengthen security & compliance through better access control, audit-ready logging, secrets management, and hardened infrastructure
What we offer
  • Medical, dental, and vision insurance
  • 401(k) match
  • Fertility benefits via Carrot
  • Flexible Time Away + paid holidays
  • In-office lunches for our NY Office
  • Hybrid work schedule (Mon & Fri remote; Tues-Thurs in-office)
  • Social events - happy hours, birthday celebrations, holiday parties, & more!
  • Opportunity to make an impact – you’ll be an integral player in bringing our vision to life
  • Fulltime

Senior Software Engineer, DevOps

We are seeking a talented and experienced DevOps/SRE (Site Reliability Engineeri...
Location
India , Bengaluru
Salary
Not provided
Roku
Expiration Date
Until further notice
Requirements
  • 8+ years of experience in DevOps/SRE roles
  • Experience in cloud-focused software development, preferably in Go, Python, or other object-oriented programming languages
  • Experience with a number of the following: ECS, Docker, Kubernetes, Envoy, Istio, Linkerd, Solo
  • Experience with Infrastructure as Code (IaC) tools such as Terraform, Ansible, or CloudFormation
  • Strong understanding of distributed systems, microservices architecture, and cloud-native technologies
  • Strong proficiency in cloud platforms such as AWS, Azure, or GCP
  • Solid understanding of networking, security, and compliance principles
  • Proven track record of driving results and delivering high-quality solutions in a fast-paced environment
  • Demonstrated ability to communicate clearly with both technical and non-technical project stakeholders
  • BS Degree in Computer Science or Equivalent
Job Responsibility
  • Oversee the design, implementation, and maintenance of scalable and resilient cloud infrastructure on platforms spanning AWS and GCP
  • Ensure high availability, reliability, and performance of critical systems
  • Collaborate with your peers to be responsible for the entire software lifecycle
  • Manage individual project priorities, deadlines, and deliverables related to your technical expertise and assigned domains
  • Lead incident response efforts
  • Implement effective incident management processes and post-incident reviews
  • Collaborate with security teams to ensure the integrity and security of infrastructure and applications
  • Identify performance bottlenecks and optimize system resources for maximum efficiency
  • Conduct regular performance tuning and capacity planning exercises
  • Drive continuous improvement initiatives within the team and across the organization
What we offer
  • Global access to mental health and financial wellness support and resources
  • Local benefits include statutory and voluntary benefits which may include healthcare (medical, dental, and vision), life, accident, disability, commuter, and retirement options (401(k)/pension)
  • Vacation and other personal time off
  • Fulltime