Senior Service Reliability Engineer Job at Plusnet (Gurugram)

Senior Site Reliability Engineer

This is a role at Baxter where your work impacts saving and sustaining lives thr...

Location

United States, Deerfield

Salary:

96000.00 - 132000.00 USD / Year

Baxter

Expiration Date

Until further notice

Requirements

Bachelor's degree in computer science, IT, or related field (or equivalent experience)
Prior experience in Site Reliability Engineering and cloud-based infrastructure management
Experience in enterprise engineering, including 24x7 uptime, regulated environments, and planning/operations
Azure administration and operations experience, with certifications a plus
Knowledge of related technologies, including cloud, encryption, and security protocols
Systems administration experience in Windows and Linux environments
Proven problem-solving skills and experience with scripting and automation tools
Ability to create accurate documentation and reports, with excellent communication skills
Applicants must be authorized to work for any employer in the U.S.
Baxter is unable to sponsor or take over sponsorship of an employment visa at this time.

Job Responsibility

Drive strategies to ensure 24x7 availability of services and business continuity for customer-facing healthcare software applications and platforms hosted on Microsoft Azure cloud
Manage and administer Azure resources, including virtual machines, databases, and networking components
Define and document operating procedures to ensure required security, privacy, and other compliance standards are maintained for digital solutions deployed in the cloud
Manage process, planning, and execution for Disaster Recovery (DR) and Business Continuity Planning (BCP)
Define and refine Operations SLAs to maintain a high level of customer satisfaction
Establish non-functional requirements to meet SLAs
Establish infrastructure and application monitoring dashboards and workflow for automatic routing of notifications
Define key performance indicators that can be monitored, measured, and used to derive opportunities
Standardize site metrics for stakeholders, reporting on various KPIs including SLAs, availability, capacity utilization, service metrics and cost utilization
Work closely with DevOps Engineers to automate infrastructure provisioning and deployment processes

What we offer

Support for Parents
Continuing Education/Professional Development
Employee Health & Well-Being Benefits
Paid Time Off
2 Days a Year to Volunteer
Medical and dental coverage starting day one
Insurance coverage for basic life, accident, short-term and long-term disability
Business travel accident insurance
Employee Stock Purchase Plan (ESPP)
401(k) Retirement Savings Plan

Full-time

Senior Site Reliability Engineer

We are looking for a Senior Site Reliability Engineer who is passionate about sc...

Location

Salary:

Not provided

Atlassian

Expiration Date

Until further notice

Requirements

5+ years of experience operating high-availability, fault-tolerant, scalable, distributed software in production: building monitoring, tweaking dashboards, defining alerts, writing runbooks, etc.
5+ years of hands-on experience with public cloud offerings (AWS components like EC2, CloudFormation, RDS / Aurora, Caches, SQS - or equivalents, e.g. in GCP / Azure)
Familiarity with Unix / Linux operating systems
Strong emphasis on debugging, improving code, and automating routine tasks
Strong backend engineering experience in one or more prominent languages such as Java, Go or Python
Excellent communication skills in written and verbal forms, and an ability to communicate complex technical issues to a range of technical and non-technical audiences (management, peers, clients)
An ability and desire to mentor and coach engineers

What we offer

Health coverage
Paid volunteer days
Wellness resources

Full-time

Senior Site Reliability Engineer

Architect, develop, and troubleshoot large-scale infrastructure, maintain and im...

Location

United States, San Francisco

Salary:

180960.00 - 230900.00 USD / Year

Atlassian

Expiration Date

Until further notice

Requirements

Bachelor’s degree in Computer Science, Software Engineering, Information Technology or a closely related field
Four years of experience as a Site Reliability Engineer architecting, developing, and troubleshooting large-scale infrastructure utilizing programming languages such as PowerShell, Python, or Bash
Networking technologies such as TCP/IP or security
Four years of experience in automation development and infrastructure-as-code implementation using tools such as Terraform, AWS CloudFormation, Ansible, or Salt
Knowledge of Linux and Windows systems
Cloud technologies within AWS, GCP, Azure
Continuous integration and continuous delivery/deployment (CI/CD) practices, and monitoring and observability practices
Must pass a technical interview

Job Responsibility

Architect, develop, and troubleshoot large-scale infrastructure utilizing programming languages such as PowerShell, Python, or Bash and networking technologies such as TCP/IP or security
Provide real-time feedback on production systems
Work with product family and platform developers to maintain and improve services and performance with a strong customer focus
Utilize a variety of data collection, enrichment, analytics, and visualizations to support our complex systems
Responsible for automation development and infrastructure-as-code implementation using tools such as Terraform, AWS CloudFormation, Ansible, and/or Salt
Build solutions to enhance availability, performance, and stability for hundreds of Atlassian enterprise customers in the cloud, and automate repetitive work
Help secure the cloud architecture with penetration testing, vulnerability resolution, and compliance audit responses
Responsible for continuous integration and continuous delivery/deployment (CI/CD) practices, and monitoring and observability practices

What we offer

Health and wellbeing resources
Paid volunteer days

Full-time

Senior Site Reliability Engineer

Baxter International is seeking a skilled Senior Principal Site Reliability Engi...

Location

United States, Deerfield

Salary:

96000.00 - 132000.00 USD / Year

Baxter

Expiration Date

Until further notice

Requirements

Bachelor's degree in computer science, IT, or related field (or equivalent experience)
Prior experience in Site Reliability Engineering and cloud-based infrastructure management
Experience in enterprise engineering, including 24x7 uptime, regulated environments, and planning/operations
Azure administration and operations experience, with certifications a plus
Knowledge of related technologies, including cloud, encryption, and security protocols
Systems administration experience in Windows and Linux environments
Proven problem-solving skills and experience with scripting and automation tools
Ability to create accurate documentation and reports, with excellent communication skills

Job Responsibility

Drive strategies to ensure 24x7 availability of services and business continuity for customer-facing healthcare software applications and platforms hosted on Microsoft Azure cloud
Manage and administer Azure resources, including virtual machines, databases, and networking components
Define and document operating procedures to ensure required security, privacy, and other compliance standards are maintained for digital solutions deployed in the cloud
Manage process, planning, and execution for Disaster Recovery (DR) and Business Continuity Planning (BCP)
Define and refine Operations SLAs to maintain a high level of customer satisfaction
Establish non-functional requirements to meet SLAs
Establish infrastructure and application monitoring dashboards and workflow for automatic routing of notifications
Define key performance indicators that can be monitored, measured, and used to derive opportunities
Standardize site metrics for stakeholders, reporting on various KPIs including SLAs, availability, capacity utilization, service metrics and cost utilization
Work closely with DevOps Engineers to automate infrastructure provisioning and deployment processes

What we offer

Healthcare benefits
Employee Stock Purchase Plan (ESPP)
401(k) Retirement Savings Plan
Flexible Spending Accounts
Educational assistance programs
Paid holidays
Paid time off
Paid parental leave
Commuting benefits
Employee Discount Program

Full-time

Senior Software Engineer, Site Reliability

Babylist is looking for a Senior Software Engineer, Site Reliability to join our...

Location

United States; Canada

Salary:

186818.00 - 224183.00 USD; CAD / Year

Babylist

Expiration Date

Until further notice

Requirements

8+ years of experience as a Site Reliability Engineer or similar role
Experience supporting high-traffic consumer-facing websites
Proficiency with Terraform
Strong experience working with AWS cloud-based infrastructure and services
Proficiency with Docker and Kubernetes
Solid understanding of cloud-native systems design
Troubleshooting and debugging skills
Experience designing and supporting CI systems
Familiar with monitoring and alerting best practices
Proven experience in on-call management best practices

Job Responsibility

Manage and build our AWS infrastructure using Infrastructure as Code (IaC) tools like Terraform
Improve the speed and reliability of our Continuous Integration (CI) systems
Provide support to developers in troubleshooting issues
Establish, communicate, and support best practices for monitoring and alerting

What we offer

Company-paid medical, dental, and vision insurance
Retirement savings plan with company matching and flexible spending accounts
Generous paid parental leave and PTO
Remote work stipend
Perks for physical, mental, and emotional health, parenting, childcare, and financial planning

Full-time

Senior Software Engineer - Observability and Reliability

We are growing the engineering team and looking for engineers who have the chops...

Location

United States, New York City

Salary:

150000.00 - 220000.00 USD / Year

Sigma Computing

Expiration Date

Until further notice

Requirements

Strong Computer Science fundamentals
5+ years industry experience building and maintaining high-quality software, especially software other engineers use
You apply a product mindset to infrastructure systems and feel accomplished enabling others
Desire to be a great teammate and have fun at work
Strong sense of craftsmanship, and a healthy academic curiosity

Job Responsibility

Build observability tools and platforms, including: metrics, logging, distributed tracing, dashboarding, alerting, application performance management
Build with modern tools and languages like Go, OpenTelemetry, and Kubernetes
Participate in on-call rotation and ensure uptime of services
Create runtime tools/processes that optimize cloud triaging and limit downtime
Define best practices around making our systems and services measurable
Collaborate with peers and stakeholders through design and code reviews to ensure best practices amongst available technologies. We expect successful candidates to spend the majority of their time coding.

What we offer

Equity
Generous health benefits
Flexible time off policy. Take the time off you need!
Paid bonding time for all new parents
Traditional and Roth 401(k)
Commuter and FSA benefits
Lunch Program
Dog friendly office

Full-time

Senior Site Reliability Engineer

As a Senior Site Reliability Engineer on the Platform team, you will identify is...

Location

United States, Denver; San Francisco

Salary:

138000.00 - 191000.00 USD / Year

Checkr

Expiration Date

Until further notice

Requirements

Degree in Computer Science (or related field)
6+ years of experience in building tools with Python (preferred), GoLang, or Ruby
6+ years of experience in maintaining and observing production customer-facing environments in AWS or Azure
6+ years of experience as a member of an incident response team
Deep understanding of the fundamental infrastructure and platform concepts behind a microservice architecture, REST APIs, and asynchronous queueing models
Experience with observability platforms and frameworks like Datadog, Splunk, Grafana, Prometheus, or OpenTelemetry
Strong collaboration, documentation, communication, and project management skills
Experience with container orchestration using Kubernetes/Docker/Terraform
Experience driving platform adoption across engineering teams, guided by a self-service and product-first approach
A passion for customer-centricity and building relationships with other teams

Job Responsibility

Collaborate, drive, and execute architectural discussions with cross-functional teams
Lead cross-team projects and SREs' technical roadmap to enable engineering and help Checkr customers
Design, build, ship, and maintain the core observability libraries, tools, and patterns used by all of Checkr’s engineering teams
Proactively engage across teams to foster service reliability, efficiency, and scalability
Troubleshoot complex production issues across the stack, with respect to performance, availability, and data quality
Present detailed technical information and benefits of the Checkr platform to a wide array of customers, including operations, developers, technical architects, and executives

What we offer

A fast-paced and collaborative environment
Learning and development allowance
Competitive cash and equity compensation and opportunities for advancement
100% medical, dental, and vision coverage
Up to $25K reimbursement for fertility, adoption, and parental planning services
Flexible PTO policy
Monthly wellness stipend, home office stipend
In-office perks such as lunch four times a week, commuter stipend, and an abundance of snacks and beverages

Full-time

Senior Site Reliability Engineer

We are seeking an experienced Senior Site Reliability Engineer (L3) to join our ...

Location

India, Chennai

Salary:

Not provided

Arcadia

Expiration Date

Until further notice

Requirements

Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience
8–10+ years of experience in SRE/DevOps/Cloud Engineering, with deep hands-on exposure to AWS and Kubernetes
Strong hands-on experience with: Terraform & Infrastructure as Code
AWS core services (EKS, IAM, RDS, EC2, VPC, CloudWatch, CloudTrail, GuardDuty)
Jenkins + Groovy, GitHub Actions, ArgoCD, FluxCD
Kubernetes troubleshooting and operations
Prometheus/Grafana/Datadog observability stacks
Proven ability to operate in high-scale, high-uptime, multi-environment production systems
Experience building automation via Python/Bash and reducing operational toil
Strong understanding of incident management, root cause analysis, and reliability engineering principles

Job Responsibility

Design, build, and maintain AWS infrastructure (EKS, VPC, RDS, IAM, CloudWatch, CloudTrail, GuardDuty, Load Balancers, S3, CloudFront) using Terraform and CloudFormation
Lead all aspects of Kubernetes operations including cluster upgrades, performance tuning, CNI troubleshooting, workload scaling, Helm chart packaging, and GitOps deployments
Own and evolve our CI/CD ecosystem across Jenkins (Groovy scripting), GitHub Actions, AWS CodePipeline, ArgoCD, and FluxCD
Improve platform reliability by reducing operational toil through automation, scripting (Python/Bash), and proactive system hardening
Implement and enhance observability across Prometheus, Grafana, Loki, Tempo, Datadog, and CloudWatch, ensuring actionable alerting, dashboards, and metrics alignment with SLOs/SLIs
Drive FinOps initiatives, identifying cost inefficiencies and working with engineering teams to implement best practices, tagging standards, budgeting, and resource right-sizing
Manage database operations across MySQL and PostgreSQL including backups, performance tuning, replication, and operational runbooks
Maintain and improve secret management using Vault, AWS Secrets Manager, and Parameter Store
Strengthen cloud security posture with IAM least privilege, CSPM reviews, audit readiness, GuardDuty/CloudTrail monitoring, and environment hardening
Troubleshoot complex production issues across networking, Kubernetes, compute, databases, and CI/CD systems

What we offer

Competitive compensation and employee stock options
Hybrid/remote-first working model (India-based role, with global collaboration)
Flexible leave policy
Comprehensive medical insurance (self + family members)
Annual performance cycle + quarterly recognition awards
A supportive, diverse engineering culture grounded in empathy, teamwork, and innovation

Full-time

Senior Service Reliability Engineer

Plusnet

Location:
India, Gurugram

Category:
IT - Administration

Contract Type:
Not provided

Salary:

Job Description:

Job Responsibility:

Requirements:

Additional Information:

Job Posted:
January 13, 2026

