Sr SRE

Randstad

Location:
India, Putlibowli

Contract Type:
Not provided

Salary:

Not provided

Job Responsibility:

  • Develop and maintain Infrastructure as Code (IaC) using tools like Terraform, Ansible, Dynatrace to automate deployment and management of infrastructure
  • Build and manage CI/CD pipelines to ensure efficient and reliable application deployments
  • Improve infrastructure provisioning and configuration through automation, minimizing manual interventions and reducing human error
  • Monitor the health, performance, and reliability of production systems and applications
  • Design, implement, and maintain automated monitoring solutions, using tools such as Datadog
  • Define and monitor service level objectives (SLOs), service level indicators (SLIs), and error budgets to ensure system reliability and availability meet customer expectations
  • Implement effective alerting systems to identify and address potential issues before they impact users
  • Lead root cause analysis (RCA) and post-mortem investigations after incidents to identify improvements and avoid recurrence
  • Respond to production incidents, diagnose root causes, and implement corrective actions
  • Create and maintain playbooks and documentation for incident response, troubleshooting, and recovery processes
  • Collaborate closely with development teams during the post-deployment phase to ensure smooth rollouts and address any production issues
  • Work alongside software engineers to design, deploy, and scale applications that are highly available, resilient, and fault tolerant
  • Provide guidance and support in ensuring that code is written with an operational mindset, enabling easy deployment, monitoring, and debugging
  • Act as a bridge between development, operations, and business teams, ensuring that infrastructure and software align with business goals
  • Experience working with cloud platforms such as AWS, Microsoft Azure and/or GCP
  • Expertise with Git, Jenkins, CircleCI, GitLab CI, or similar CI/CD platforms
  • Stay current with emerging technologies, tools, and trends in site reliability engineering, DevOps, and cloud computing
  • Lead or contribute to internal initiatives aimed at improving system performance, reliability, and operational efficiency
  • Propose and lead process improvements, optimizations, and innovations in automation and system design
  • Strong written and verbal communication skills, able to collaborate with cross-functional teams, write documentation, and explain technical concepts to non-technical stakeholders
  • Ability to work effectively in a fast-paced environment, collaborating with software developers, other SREs, operations teams, and business stakeholders

Requirements:

  • Develop and maintain Infrastructure as Code (IaC) using tools like Terraform, Ansible, Dynatrace
  • Build and manage CI/CD pipelines
  • Improve infrastructure provisioning and configuration through automation
  • Monitor the health, performance, and reliability of production systems and applications
  • Design, implement, and maintain automated monitoring solutions, using tools such as Datadog
  • Define and monitor service level objectives (SLOs), service level indicators (SLIs), and error budgets
  • Implement effective alerting systems
  • Lead root cause analysis (RCA) and post-mortem investigations
  • Respond to production incidents, diagnose root causes, and implement corrective actions
  • Create and maintain playbooks and documentation for incident response
  • Collaborate closely with development teams
  • Work alongside software engineers to design, deploy, and scale applications
  • Provide guidance and support in ensuring that code is written with an operational mindset
  • Act as a bridge between development, operations, and business teams
  • Experience working with cloud platforms such as AWS, Microsoft Azure and/or GCP
  • Expertise with Git, Jenkins, CircleCI, GitLab CI, or similar CI/CD platforms
  • Stay current with emerging technologies, tools, and trends in site reliability engineering, DevOps, and cloud computing
  • Lead or contribute to internal initiatives aimed at improving system performance, reliability, and operational efficiency
  • Propose and lead process improvements, optimizations, and innovations in automation and system design
  • Strong written and verbal communication skills
  • Ability to work effectively in a fast-paced environment, collaborating with software developers, other SREs, operations teams, and business stakeholders

Additional Information:

Job Posted:
March 09, 2026

Expiration:
March 30, 2026

Employment Type:
Full-time

Looking for more opportunities? Search for other job offers that match your skills and interests.

Similar Jobs for Sr SRE

Applications Development Sr Programmer Analyst

Integration Services within Common Platform Engineering is responsible for devel...
Location:
Canada, Mississauga
Salary:
Not provided
Citi
Expiration Date:
Until further notice
Requirements:
  • Experience working in Financial Services or a large complex and/or global environment
  • Experience with the following technologies: Kafka ecosystem (Confluent distribution preferred)
  • Kubernetes and OpenShift
  • Java
  • React
  • Familiarity with SRE practices
  • Consistently demonstrates clear and concise written and verbal communication
Job Responsibility:
  • Designing and developing workflow solutions to integrate Kafka with our data governance and control platforms
  • Understanding the existing onboarding flow and working to streamline and simplify the process
  • Designing and developing developer-facing tooling to manage topics and connectors
  • Helping to deliver the SRE requirements for this stack
Employment Type:
Full-time

Sr./Staff - Infrastructure/Site Reliability Engineer (SRE)

Shape the future of trust in the age of AI. At Oscilar, we're building the most ...
Location:
United Kingdom
Salary:
Not provided
Oscilar
Expiration Date:
Until further notice
Requirements:
  • Proven track record as a senior SRE or Infrastructure Engineer in high-scale environments
  • Expert-level skills in AWS and Infrastructure as Code (Pulumi, Terraform)
  • Strong programming ability in Go or Python (we use Go)
  • Deep understanding of distributed systems (Kafka, ClickHouse) and microservices architecture
  • Mastery of container orchestration (Kubernetes) and production debugging
  • Strong sense of ownership, and the judgment to balance velocity with reliability
Job Responsibility:
  • Architect and operate resilient cloud infrastructure (AWS, Pulumi, Kubernetes)
  • Lead initiatives to improve availability, latency, and performance at scale
  • Design and evolve our CI/CD pipelines to optimize for speed, safety, and repeatability
  • Define the metrics, alerts, and runbooks that form our observability backbone
  • Run chaos experiments and failure simulations to harden the platform
  • Mentor engineers and set best practices for SRE across the company
Employment Type:
Full-time

Sr./Staff - Infrastructure/Site Reliability Engineer (SRE)

Shape the future of trust in the age of AI. At Oscilar, we're building the most ...
Location:
Poland
Salary:
Not provided
Oscilar
Expiration Date:
Until further notice
Requirements:
  • Proven track record as a senior SRE or Infrastructure Engineer in high-scale environments
  • Expert-level skills in AWS and Infrastructure as Code (Pulumi, Terraform)
  • Strong programming ability in Go or Python (we use Go)
  • Deep understanding of distributed systems (Kafka, ClickHouse) and microservices architecture
  • Mastery of container orchestration (Kubernetes) and production debugging
  • Strong sense of ownership, and the judgment to balance velocity with reliability
Job Responsibility:
  • Architect and operate resilient cloud infrastructure (AWS, Pulumi, Kubernetes)
  • Lead initiatives to improve availability, latency, and performance at scale
  • Design and evolve our CI/CD pipelines to optimize for speed, safety, and repeatability
  • Define the metrics, alerts, and runbooks that form our observability backbone
  • Run chaos experiments and failure simulations to harden the platform
  • Mentor engineers and set best practices for SRE across the company
Employment Type:
Full-time

Senior Release Train Manager - IT Infrastructure

The Senior Release Train Manager (RTM) is a servant leader and coach that is res...
Location:
United States
Salary:
139000.00 - 186000.00 USD / Year
Zelis
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree or equivalent work experience required
  • Must have 8+ years of experience as a Senior Scrum Master or Release Train Engineer
  • 5+ years supporting IT Infrastructure teams and projects, such as cloud engineering, data center operations, database engineering, network engineering, ITSecOps, desktop engineering, EUC, SRE, etc.
  • 5+ years of experience as a people manager is required
  • Experience with Agile tools (e.g. Jira, Confluence, TFS, DevOps, etc.)
  • Thorough knowledge of Scrum framework and application of core Agile values and principles in managing Product Development and Delivery, SAFe Agile preferred
  • Experience in successfully delivering medium/large size projects with minimal direction
  • SAFe RTE and CSM certifications are strongly preferred
  • 5+ years of experience with Jira and/or Jira Align is strongly preferred
Job Responsibility:
  • Leading, hiring, mentoring, and scaling a high-performing team of Scrum Masters and RTEs to support the IT Infrastructure ART and the Enterprise Applications ART
  • Leading ART PI Planning, program execution, scrum, and ART sync events
  • Coaching Scrum Teams in relentless improvement
  • Creating transparency using metrics to drive positive change
  • Facilitating resolution of dependencies, impediments, and risks amongst the scrum teams
  • Building high-performing teams – Focus on ever-improving team dynamics and performance
  • Enabling organizational effectiveness – Work with other stakeholders to help the team contribute towards improving the overall development Value Stream
  • Exhibits Lean-Agile leadership – Exhibits the behaviors of a Lean-Agile Leader with a Lean-Agile Mindset. Helps the team embrace SAFe Core Values, adopt, and apply SAFe Principles, and implement SAFe practices
  • Facilitates the team’s progress toward team goals – Trained as a team facilitator and continuously engaged in challenging the old norms of development to improve performance in the areas of quality, predictability, flow, and velocity. Assists the product owner(s) in maintaining and prioritizing the product backlog. Helps the team focus on daily and Sprint Goals in the context of the current Program Increment (PI) Objectives
  • Leads team efforts in relentless improvement – Helps the team improve and take responsibility for their actions
What we offer:
  • 401k plan with employer match
  • flexible paid time off
  • holidays
  • parental leaves
  • life and disability insurance
  • health benefits including medical, dental, vision, and prescription drug coverage
Employment Type:
Full-time

Sr Platformization/Cloud Automation Engineer

Palo Alto Networks CDSS group is looking for a seasoned platformization and clou...
Location:
United States, Santa Clara
Salary:
104600.00 - 169225.00 USD / Year
Palo Alto Networks Italia
Expiration Date:
Until further notice
Requirements:
  • Bachelor's/Master's degree in Computer Science or a related field
  • 5+ years of industry experience in engineering
  • Fluent scripting skills (preferably Python or Bash) with deep experience in Unix/Linux systems from kernel to shell and beyond
  • 4+ years of working with Microservices architectures on Kubernetes
  • Hands-on experience with container-native tools like Docker and Helm for managing workloads running in Kubernetes
  • Experience managing AWS and GCP at scale, with knowledge of cloud-neutral connectivity between platforms
  • Experience designing and maintaining API specifications using Swagger/OpenAPI, and working with API frameworks such as Apigee to enable secure, scalable integrations
  • Hands-on experience with infrastructure-as-code and automation tools such as Terraform, Ansible, etc.
  • Proficient in CI/CD platforms like GitLab CI, Jenkins, ArgoCD, CircleCI, etc.
  • In-depth knowledge of operating systems (processes, threads, concurrency, etc.)
Job Responsibility:
  • Work with development teams to ensure that applications have scalability and reliability built-in from day one
  • Design, review and enhance software architecture to improve scalability, service reliability, cost, and performance
  • Drive platformization by building standardized, self-service infrastructure platforms that improve developer productivity, scalability, and operational efficiency
  • Deploy automation for provisioning and operating infrastructure at large scale
  • Partner with teams to improve CI/CD processes and technology
  • Mentor members of the staff on large scale cloud deployments
  • Drive the adoption of observability practices and a data-driven mindset
  • Set up processes such as on-call rotations, postmortems, and runbooks to continue supporting the infrastructure owned by the SRE team while finding ways to reduce the time to resolution and improve the reliability of services
  • Support, optimize and deploy mission critical, front-end and back-end production
  • Improving site performance, monitoring, and overall stability of our infrastructure
Employment Type:
Full-time

Sr. Java/Kotlin Full Stack Engineer

Piper Companies is seeking a Full Stack Engineer to join a leader in the financi...
Location:
United States, Cary
Salary:
142000.00 USD / Year
Piper Companies
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science or related field
  • Strong proficiency in Kotlin (Java acceptable), TypeScript, React/Angular, and Spring Boot
  • Hands-on experience with REST API development and micro-frontend architecture
  • Familiarity with test-driven development and secure coding practices
  • Excellent problem-solving skills and ability to work in a collaborative, agile environment
Job Responsibility:
  • Develop and maintain applications using Kotlin (or strong Java experience), TypeScript, and modern frameworks such as React or Angular
  • Design and implement RESTful APIs for secure and scalable integrations
  • Collaborate with cross-functional squads including DevOps and SRE teams to deliver high-quality software solutions
  • Participate in code reviews and follow test-driven development practices
  • Support modernization efforts and migration of applications from Python to Kotlin
What we offer:
  • Medical, Dental, Vision, 401k, PTO, holidays, sick leave as required by law
  • 3% bonus
Employment Type:
Full-time

Sr Software Engineer - ServiceNow ITOM

This role is essential for designing, implementing, and deploying scalable softw...
Location:
United States, Bellevue; Frisco
Salary:
113600.00 - 205000.00 USD / Year
T-Mobile
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science or Engineering (Required)
  • 4-7 years of technical engineering experience
  • Communication (Required)
  • Customer Service (Required)
  • Analytics (Required)
  • Technical Writing (Required)
  • At least 18 years of age
  • Legally authorized to work in the United States
Job Responsibility:
  • Design, build, and enhance ServiceNow ITOM solutions—including Discovery patterns, Service Mapping logic, automation workflows, federations/integrations, and operational dashboards—to ensure accurate, high-quality, and scalable outcomes. Conduct unit, integration, and platform testing to validate ITOM functionality and maintain high standards of reliability.
  • Drive architectural improvements across the ITOM stack by applying modern engineering practices, new ServiceNow capabilities, and cloud-native pattern methodologies. Recommend and implement enhancements that strengthen CMDB integrity, accelerate root cause analysis, and improve service health visibility.
  • Partner with cross-functional groups—including ITOM, CMDB, SRE, network, cloud, and application teams—to gather requirements, deliver aligned solutions, and ensure platform adoption. Provide mentorship and documentation that elevates team capability in areas such as Discovery tuning, CMDB architecture, integration patterns, and operational automation.
  • Support technology strategy by evaluating and applying current technologies that align with business goals
  • Create clear documentation for software code, system designs, and business requirements to support knowledge sharing
  • Provide additional engineering support for adjacent ServiceNow modules (e.g., CMDB, CSDM, ITAM, or ITOM submodules) and contribute to enterprise automation or operational improvements as required.
What we offer:
  • medical, dental and vision insurance
  • a flexible spending account
  • 401(k)
  • employee stock grants
  • employee stock purchase plan
  • paid time off
  • up to 12 paid holidays
  • paid parental and family leave
  • family building benefits
  • back-up care
Employment Type:
Full-time

General Operative

We are looking for a reliable General Operative to join our production team in a...
Location:
United Kingdom, Swindon
Salary:
12.00 - 13.00 GBP / Hour
Randstad
Expiration Date:
April 27, 2026
Requirements:
  • GCSE Maths and English (Grade C/4 or above) or equivalent
  • Previous experience in a regulated industry (Food, Pharma, Auto, or Engineering) is preferred but not essential
Job Responsibility:
  • Cleanroom Support: Following strict gowning and hygiene procedures to enter production areas
  • Production Tasks: Assisting with packaging, labeling, and weighing products
  • Maintenance: Cleaning equipment and production areas according to standard operating procedures (SOPs)
  • Material Handling: Moving stock, checking expiry dates, and loading equipment like washers or autoclaves
  • Documentation: Keeping accurate, clear records of all work completed