Senior Devops/SRE Engineer Job at N-iX

Senior Software Engineer - SRE

Roku is changing how the world watches TV. Roku is the #1 TV streaming platform ...

Location

India, Bengaluru

Salary:

Not provided

Roku

Expiration Date

Until further notice

Requirements

Preferably 8+ years of experience in DevOps/SRE roles, with demonstrated expertise in implementing SRE principles, SLO/SLI frameworks, and error budget policies in production environments
Deep experience with observability and monitoring platforms such as Prometheus, Grafana, Datadog, New Relic, or equivalent, including experience building custom dashboards, alerts, and SLO-based monitoring
Strong background in incident management, including experience as an Incident Commander, conducting blameless postmortems, and implementing systematic reliability improvements based on incident learnings
Strong understanding of distributed systems and reliability engineering, including failure modes, fault tolerance patterns, circuit breakers, bulkheads, rate limiting, and graceful degradation strategies
Experience with a number of the following: Kubernetes, Docker, Service Mesh such as Istio, Envoy, Linkerd, Solo & ECS
Experience in cloud-focused software development, preferably in Go, Python, or other object-oriented programming languages
Experience with Infrastructure as Code (IaC) tools such as Terraform, Ansible, or CloudFormation
Experience with CI/CD automation, including GitLab pipelines and other related tools
Strong hands-on experience with cloud platforms such as AWS, GCP or Azure
Proven track record of implementing scalable, high-performance infrastructure solutions in fast-paced, dynamic environments

Job Responsibility

Design & Infrastructure
Contribute to postmortem culture by facilitating comprehensive, blameless post-incident reviews that identify root causes, contributing factors, and actionable remediation items. Track incident trends to identify systemic issues and prioritize reliability improvements
Implement chaos engineering practices to proactively identify failure modes, validate system resilience, and build confidence in recovery procedures. Conduct game days and disaster recovery exercises
SRE Process & Principles Implementation
Deploy and evolve SRE practices across the organization by establishing core SRE principles, frameworks, and methodologies. Define and implement service reliability practices, including Service Level Objectives (SLOs), Service Level Indicators (SLIs), and Error Budgets, to balance innovation velocity with system reliability
Manage Error Budgets as a mechanism for making data-driven decisions about feature velocity vs. reliability. Track, report, and enforce error budget policies, facilitating conversations between engineering and product teams about risk tolerance and release decisions
Reliability Engineering & Infrastructure
Reduce toil through automation by identifying repetitive operational work and systematically eliminating it through infrastructure-as-code, automation frameworks, and intelligent tooling. Measure and track toil reduction efforts, aiming to keep toil below 50% of team time
Implement capacity planning processes that ensure systems have adequate headroom to meet SLOs during peak traffic, unexpected load spikes, and degraded states. Develop predictive models and automated scaling mechanisms
Observability, Monitoring & Reporting

What we offer

Global access to mental health and financial wellness support and resources
Healthcare (medical, dental, and vision)
Life, accident, disability, commuter, and retirement options (401(k)/pension)
Time off in accordance with local leave policies

Fulltime

Senior DevOps Engineer

We are looking for a highly skilled Senior DevOps Engineer to design, implement,...

Location

India, Ahmedabad

Salary:

Not provided

Codezeros

Expiration Date

Until further notice

Requirements

7+ years of experience in DevOps/SRE roles
Strong hands-on experience with both AWS and Azure
Expertise in Linux system administration
Proficiency in scripting (Bash, Python, or Go)
AWS Services: EC2, S3, RDS, Lambda, VPC, IAM
Azure Services: VM, Blob Storage, Azure Functions, VNet, Azure AD
CI/CD pipelines and automation
Infrastructure as Code (Terraform preferred)
Containerization and orchestration (Docker + Kubernetes)
Networking fundamentals (DNS, TCP/IP, Load Balancing)

Job Responsibility

Design and manage scalable, fault-tolerant systems on AWS and Azure
Optimize cloud cost, performance, and security
Implement high availability, disaster recovery, and backup strategies
Build and maintain CI/CD pipelines using tools like Jenkins, GitHub Actions, and Azure DevOps
Automate build, test, and deployment processes
Develop and manage infrastructure using Terraform, AWS CloudFormation, and Azure Resource Manager
Ensure version-controlled and reusable infrastructure modules
Manage containerized workloads using Docker and Kubernetes
Deploy and manage clusters on AWS EKS and Azure AKS
Implement observability solutions using Prometheus, Grafana, the ELK Stack, and cloud-native tools (CloudWatch, Azure Monitor)

Fulltime

Senior Software Engineer (Cloud & DevOps)

At 3Shape, we use cloud platforms to deliver secure, reliable services to both i...

Location

Denmark, Copenhagen

Salary:

Not provided

3Shape

Expiration Date

Until further notice

Requirements

5 years of experience, including a minimum of 3 years of professional C#/.NET backend development experience, ideally in a cloud environment.
Minimum 3 years of hands-on DevOps/SRE experience, ideally in a role combining software development and operations.
Strong backend engineering fundamentals (design, performance, security, and maintainability).
Experience with API design, automated testing, code reviews, and building maintainable systems.
Experience with containerized workloads and Kubernetes (e.g., Azure Kubernetes Service).
Curiosity for modern engineering practices and a strong understanding of core Azure concepts (networking, compute, storage, identity, and databases).
Experience with monitoring/observability in Azure (e.g., Azure Monitor, Application Insights, Log Analytics) and incident handling is a plus.
A strong ownership mindset: automation-first, focus on reliability, and continuous improvement of quality and stability.

Job Responsibility

Design, implement, and maintain backend services in the Account domain, delivering features end-to-end from implementation and testing to deployment readiness.
Be the team's primary point of contact for DevOps topics and drive improvements across CI/CD, AKS/Kubernetes, Infrastructure as Code, observability, and platform stability.
Collaborate with platform and product teams across 3Shape to align on Azure standards and best practices, especially around Infrastructure as Code, observability, and operational readiness.
Help define actionable alerts and dashboards, improve runbooks, and build safe automation so incidents can be detected, triaged, and mitigated quickly even outside normal working hours.

What we offer

Central Copenhagen location
An attractive healthcare package to keep you fit and well.
Breakfast every day, and a delicious and healthy lunch cooked by our private chefs.
A joint purpose: to enable dentists to provide superior dental care to every patient, every time.

Fulltime

Senior Automation Engineer

Cloud Technology Services – External Job Description. About Citi Cloud Technolog...

Location

India, Pune

Salary:

Not provided

Citi

Expiration Date

Until further notice

Requirements

6–10 years of experience with Windows and RHEL server platforms and system integration
Strong background in enterprise storage engineering or administration
Expertise in DevOps tooling: Jenkins, Git, Ansible
CI/CD integration experience
Hands‑on experience with Dell‑EMC, Hitachi storage arrays, Cisco MDS switches, and Emulex HBAs
Strong written and verbal communication skills; ability to work in a global, matrixed environment
Experience with Agile, DevOps/SRE methodologies, and automation scripting (PowerShell, Ansible)
Proven ability to independently manage multiple tasks and drive engineering improvements
Ability to collaborate with teams across global time zones

Job Responsibility

Engineer, integrate, and maintain CI/CD pipelines for enterprise storage software from Dell‑EMC, Hitachi, Cisco MDS, Emulex, and Veritas
Evaluate and certify HBA cards, drivers, and firmware for Windows Server (2016–2025), RHEL 8/9/10, and VMware ESX
Diagnose and resolve SAN stack compatibility issues impacting OS upgrades across Dell and HPE servers
Produce clear, validated engineering documentation and updated SAN Stack standards
Automate testing and configuration validation using Ansible and PowerShell
Support engineering teams with cluster testing for next‑generation Emulex HBAs (16/32Gb)
Lead technical discussions with Emulex and other vendors to investigate vulnerabilities and enhance driver/firmware stability
Develop automated test plans using Citi’s internal integration test environment
Raise deployment requests for validated driver/firmware packages and maintain internal validation scripts
Ensure all work meets audit, compliance, and information security requirements

Fulltime

Senior DevOps Engineer

We are seeking a highly skilled Senior DevOps Engineer with 5+ years of experien...

Location

Spain, Barcelona

Salary:

Not provided

Abacum Inc

Expiration Date

Until further notice

Requirements

5+ years of experience in DevOps/SRE (or a comparable role)
Deep knowledge of cloud systems, especially AWS
Designing, building, and maintaining CI/CD pipelines using cloud-native tools like GitHub Actions
Strong proficiency in IaC (Terraform is a plus)
Strong proficiency building and scaling Kubernetes clusters
Expertise in Bash/POSIX shell and a high-level language like Python for automation, API interaction, and custom tooling
Strong proficiency in architecting monitoring systems (Datadog is a plus)
Versatile (hands-on doer at ease from engineering strategy to low-level details, starter attitude)
Empathetic and humble (speaks up with respect, is able to compromise, and commits despite disagreement when needed)

Job Responsibility

Design and implement our systems to be efficient, scalable, accountable, and secure
Team up with other Engineers to perform experiments and test new ideas
Build a strong DevOps culture and tooling that enable our delivery teams to be autonomous while providing best practices (security, observability, scalability, performance, etc.)
Deploy and manage our infrastructure provisioning
Develop and drive real time observability solutions that provide visibility into system health
Provide technical guidance and educate team members and coworkers on operations and cloud best practices
Continuously improve development delivery CI/CD
Develop and implement security measures related to the development processes and operational needs driven by our security and compliance team
Build and scale our Kubernetes clusters and workloads
Manage and scale our cloud databases

What we offer

Competitive compensation including equity package
Competitive vacation policy
Access to Meditopia
Hybrid working model and flexible working hours
Personal development including language courses

Fulltime

Senior Platform Engineer

We’re hiring a Senior Platform Engineer to join the Platform team. This is a str...

Location

United States, Remote; Honolulu; San Diego; Seattle; Colorado Springs; Austin; Washington; Tampa

Salary:

180000.00 - 230000.00 USD / Year

Onebrief

Expiration Date

Until further notice

Requirements

5+ years in Platform, DevOps, or Site Reliability Engineering with an infrastructure and operations focus
Proven partner to DevOps/SRE and application teams; collaborates well across functions and shares context openly
Clear, concise writing; strong documentation habits and async communication
Infrastructure as Code: Terraform (or CloudFormation), Ansible
Containers and orchestration: Docker; Kubernetes design, deployment, and operations
CI/CD: experience building and maintaining pipelines (GitLab CI/CD, Jenkins, GitHub Actions)
Scripting: proficiency with at least one of Python, Go, or Bash

Job Responsibility

Building and automating the platform: Design, provision, and manage cloud and on‑prem environments with Terraform and Ansible; operate secure, resilient Kubernetes clusters for multi‑level deployments
Designing and evolving CI/CD: Reduce lead time and maintain strict gates and environment parity by improving pipelines in GitLab CI/CD, Jenkins, and GitHub Actions
Upholding reliability, security, and compliance: Partner with SREs to implement monitoring, logging, and alerting (Prometheus, Grafana, ELK/Datadog); embed RMF and STIG controls into automation and day‑to‑day operations
Operating and optimizing our footprint: Manage AWS/Azure/GCP resources and on‑prem networks. Balance cost, performance, and security while unblocking product teams
Supporting the team, documenting, and mentoring: Triage and resolve complex platform issues; write clear docs for architectures and runbooks; share context and mentor teammates

What we offer

Flexible Work Environment: Remote work with flexible hours and unlimited PTO
Comprehensive Health Coverage: Health, dental, vision, and life insurance
Retirement Plan: 401(k) plan to secure your future
Parental Leave: 8 weeks at 100% regardless of state
Company Retreats: Annual company summit trips
Home Office Budget: $1,000 per year for home office improvements
Offers Equity

Fulltime

Senior Platform Engineer

We’re building a world-class engineering organization, and this role is a founda...

Location

United States, New York

Salary:

170000.00 - 205000.00 USD / Year

CampusGroup

Expiration Date

Until further notice

Requirements

5–10 years of experience as a backend engineer, platform engineer, DevOps/SRE engineer, or data engineer
Experience building and maintaining production services, pipelines, and cloud infrastructure at scale
Strong in Go and designing backend systems for reliability and clarity
Hands-on experience with GCP, Terraform, Docker, and CI/CD tooling
Deep care about reliability, performance, graceful degradation, and observability
Experience with data pipelines, warehouses, or operational data workflows
Curious and growth-oriented, always learning, improving, and raising the engineering bar
Communicates clearly and works collaboratively with low ego

Job Responsibility

Ship backend systems that power admissions, financial aid, registrar workflows, reporting, and real-time student experiences
Build and maintain data pipelines and ingestion flows supporting analytics, product, and operational automation
Own CI/CD and deployment workflows to ensure fast, reliable releases across backend, mobile, and frontend
Design and scale our cloud infrastructure using GCP and Terraform
Improve developer experience and data scientist experience — tools, environments, testing workflows, and internal automations
Partner across engineering (backend, mobile, data, product) to accelerate delivery and reduce operational friction
Strengthen security & compliance through better access control, audit-ready logging, secrets management, and hardened infrastructure

What we offer

Medical, dental, and vision insurance
401(k) match
Fertility benefits via Carrot
Flexible Time Away + paid holidays
In-office lunches for our NY Office
Hybrid work schedule (Mon & Fri remote; Tues-Thurs in-office)
Social events - happy hours, birthday celebrations, holiday parties, & more!
Opportunity to make an impact – you’ll be an integral player in bringing our vision to life

Fulltime

Senior Software Engineer, Devops

We are seeking a talented and experienced DevOps/SRE (Site Reliability Engineeri...

Location

India, Bengaluru

Salary:

Not provided

Roku

Expiration Date

Until further notice

Requirements

8+ years of experience in DevOps/SRE roles
Experience in cloud-focused software development, preferably in Go, Python, or other object-oriented programming languages
Experience with a number of the following: ECS, Docker, Kubernetes, Envoy, Istio, Linkerd, Solo
Experience with Infrastructure as Code (IaC) tools such as Terraform, Ansible, or CloudFormation
Strong understanding of distributed systems, microservices architecture, and cloud-native technologies
Strong proficiency in cloud platforms such as AWS, Azure, or GCP
Solid understanding of networking, security, and compliance principles
Proven track record of driving results and delivering high-quality solutions in a fast-paced environment
Demonstrated ability to communicate clearly with both technical and non-technical project stakeholders
BS Degree in Computer Science or Equivalent

Job Responsibility

Oversee the design, implementation, and maintenance of scalable and resilient cloud infrastructure on platforms spanning AWS and GCP
Ensure high availability, reliability, and performance of critical systems
Collaborate with your peers to be responsible for the entire software lifecycle
Manage individual project priorities, deadlines, and deliverables related to your technical expertise and assigned domains
Lead incident response efforts
Implement effective incident management processes and post-incident reviews
Collaborate with security teams to ensure the integrity and security of infrastructure and applications
Identify performance bottlenecks and optimize system resources for maximum efficiency
Conduct regular performance tuning and capacity planning exercises
Drive continuous improvement initiatives within the team and across the organization

What we offer

Global access to mental health and financial wellness support and resources
Local benefits include statutory and voluntary benefits which may include healthcare (medical, dental, and vision), life, accident, disability, commuter, and retirement options (401(k)/pension)
Vacation and other personal time off

Fulltime
