Site Reliability Engineer III

Zuora

Location:
India, Chennai

Contract Type:
Not provided

Salary:

Not provided

Job Description:

Zuora’s Cloud Engineering teams are responsible for cloud infrastructure, performance and uptime monitoring, internal and external shared services, infrastructure services, and more for Zuora’s customer-facing SaaS products and platforms. Our technologists work across the US, Beijing, India, Costa Rica, and remotely, using a follow-the-sun model to provide 24x7x365 coverage for critical functions. They partner closely with our Engineering, Customer Support, Security, Global Services, and Sales teams on a daily basis to keep our customers front and center.

We are seeking a highly skilled and motivated Site Reliability Engineer (SRE) to join our infrastructure team. The ideal candidate will be focused on maximizing system uptime, efficiency, and reliability while building the tools and automation necessary to scale our services. This role requires a strong balance of operational experience and development skills, with deep expertise in cloud environments and modern CI/CD practices.

This is a location-specific position that requires you to come into the office regularly to be most effective.

Job Responsibility:

  • Maintain and improve the reliability, scalability, and performance of our production systems, targeting a high-availability environment
  • Design, implement, and maintain automation solutions for infrastructure provisioning, deployment, configuration management, and monitoring using Terraform and Jenkins
  • Administer, manage, and optimize our cloud infrastructure primarily hosted on AWS, focusing on cost efficiency and secure operations
  • Develop and maintain infrastructure-as-code using Puppet and/or Ansible to ensure consistent and reproducible environments
  • Participate in on-call rotation, troubleshoot and resolve critical production incidents, and conduct comprehensive post-mortems to prevent recurrence
  • Apply strong Linux administration skills to manage, patch, and secure operating systems and underlying infrastructure
  • Manage and optimize distributed messaging systems, specifically Kafka, ensuring high throughput and data integrity

Requirements:

  • 6-8 years of relevant experience in SRE/DevOps
  • Proven hands-on working experience with core AWS services (e.g., EC2, VPC, S3, RDS, IAM, CloudWatch, EKS/ECS)
  • Deep expertise in infrastructure-as-code principles using Terraform for provisioning and state management
  • Expert-level knowledge and practical experience with configuration management tools such as Puppet and/or Ansible
  • Strong experience setting up, maintaining, and enhancing Continuous Integration/Continuous Deployment pipelines using Jenkins
  • Proficiency in scripting languages, particularly Python and/or Shell scripting, for developing automation tools and performing system administration tasks
  • Advanced knowledge of Linux operating systems, including performance tuning, troubleshooting, security, and networking fundamentals
  • Working knowledge and operational experience with distributed messaging queues, specifically Kafka

Nice to have:

  • Experience with containerization technologies like Docker and Kubernetes (EKS)
  • Familiarity with logging and monitoring tools (e.g., Prometheus, Grafana, ELK stack)
  • Knowledge of networking (TCP/IP, Load Balancing, DNS)
  • Previous experience in a 24/7 high-availability production environment

What we offer:

  • Competitive compensation, variable bonus and performance reward opportunities, and retirement programs
  • Medical Insurance
  • Generous, flexible time off
  • Paid holidays, “wellness” days, and a company-wide end-of-year break
  • Learning & Development stipend
  • Opportunities to volunteer and give back, including charitable donation match
  • Free resources and support for your mental wellbeing

Additional Information:

Job Posted:
January 13, 2026

Work Type:
On-site work

Similar Jobs for Site Reliability Engineer III

Project Engineer III

We are seeking an experienced Civil Engineer specializing in water treatment pro...
Location:
United States, Columbus
Salary:
Not provided
Wessler Engineering
Expiration Date:
Until further notice
Requirements:
  • A minimum of 4 years of experience in water treatment engineering and design
  • Bachelor of Science Degree in Civil Engineering from an ABET accredited institution
  • Professional Engineer (PE) license preferred or the ability to obtain within 12 months
  • Team-oriented with good communication and organizational skills
  • Strong analytical and problem-solving skills with a focus on technical innovation
  • Ability to successfully complete tasks independently with guidance from managers
  • Proficiency in industry-standard computer software for engineering analysis and design (e.g., AutoCAD, Water CAD, etc.)
Job Responsibility:
  • Develop, evaluate, and design water treatment systems, including conceptual planning, detailed engineering, and process optimization
  • Conduct hydraulic and process modeling to ensure the efficiency and reliability of treatment systems
  • Collaborate with multidisciplinary teams to integrate treatment systems with broader infrastructure projects
  • Provide technical expertise and guidance during the construction and commissioning of water treatment facilities
  • Analyze system performance, troubleshoot issues, and propose solutions for operational improvements
  • Stay updated on emerging technologies and advancements in water treatment processes
  • Prepare technical reports, specifications, and proposals
  • Participate in site visits to evaluate facilities and processes, meet with operations staff, and collect system information useful for evaluation and design
  • Ensure compliance with local, state, and federal regulations

Site Reliability Engineer III

The Site Reliability Engineer is responsible for designing, developing, and main...
Location:
India, Hyderabad
Salary:
Not provided
Amgen
Expiration Date:
Until further notice
Requirements:
  • Doctorate degree OR 6 to 10 years of Computer Science, IT or related field experience OR
  • Master’s degree and 7 to 10 years of Computer Science, IT or related field experience OR
  • Bachelor’s degree and 8 to 12 years of Computer Science, IT or related field experience
  • Working experience with various cloud services on AWS (Azure, GCP) and containerization technologies (Docker, Kubernetes)
  • Strong programming skills in languages such as Python
  • Working experience of infrastructure as code (IaC) tools (Terraform, CloudFormation)
  • Working experience with monitoring and alerting tools (Prometheus, Grafana, etc.)
  • Working experience with DevOps/MLOps practice and CI/CD pipelines
  • Proficiency in automated testing tools and frameworks (e.g., Selenium, JUnit, pytest), incident management, root cause analysis of production issues, and system quality improvement
Job Responsibility:
  • Design and implement systems and processes to improve the reliability, scalability, and performance of applications
  • Automate routine operational tasks, such as deployments, monitoring, and incident response, to improve efficiency and reduce human error
  • Develop and maintain monitoring tools and dashboards to track system health, performance, and availability
  • Respond to and resolve incidents promptly, conducting root cause analysis and implementing preventive measures
  • Provide ongoing maintenance and support for existing systems, ensuring that they are secure, efficient, and reliable
  • Work on integrating various software applications and platforms to ensure seamless operation across the organization
  • Implement and maintain security measures to protect systems from unauthorized access and other threats
What we offer:
  • Competitive and comprehensive Total Rewards Plans that are aligned with local industry standards

Site Reliability Engineer III

Under limited supervision, the Site Reliability Engineer III is responsible for ...
Location:
United States, Birmingham
Salary:
Not provided
Alliance Automotive UK LV Ltd
Expiration Date:
Until further notice
Requirements:
  • Typically requires a bachelor's degree and five (5) or more years of related experience or an equivalent combination
  • Understanding of Kubernetes, containers, clusters, and elastic scalability
  • Expertise in SRE principles
  • Mindset of continually finding ways to drive scalability, stability, and performance
  • Cloud Services experience with Google Cloud Platform (GCP)
  • Experience with API, service-based or microservice-based architecture
  • Proficiency in infrastructure, network, database, operating systems, or security troubleshooting and remediation
  • Architecture-level knowledge of Windows and Linux and Infrastructure systems
  • Experience with production deployment, monitoring, and operational support for enterprise-class applications (Dynatrace a plus)
  • Experience working with Continuous Integration/Continuous Deployment tools
Job Responsibility:
  • Gathers and analyzes metrics from monitoring platforms to assist in performance tuning and fault tolerance
  • Partners with development teams to improve services through testing and release procedures
  • Participates in system design, platform management and capacity planning
  • Balances feature development speed and reliability with service-level objectives
  • Works closely with the incident response team to restore service to normal operation
  • Understands debugging and applies troubleshooting skills
  • Investigates, blocks and rate-limits unwanted traffic
  • Utilizes monitoring systems and dashboards for proactive changes and alerting
  • Establishes continuous process improvement cycles where the process, performance, and supporting technologies are reviewed and enhanced where applicable
  • Performs other duties as assigned
What we offer:
  • Options for healthcare coverage, 401(k), tuition reimbursement, vacation, sick, and holiday pay
  • Fulltime

Site Reliability Engineer III

We're looking for a senior Site Reliability Engineer to join our small, high-own...
Location:
United States
Salary:
148320.00 - 185400.00 USD / Year
AbsenceSoft
Expiration Date:
Until further notice
Requirements:
  • 5+ years of experience in SRE, DevOps, or a related engineering role
  • Advanced hands-on expertise in AWS production environments and core services including Lambda, ECS, S3, ALB, and GuardDuty
  • Strong proficiency in infrastructure-as-code tooling such as Terraform, CloudFormation, or CDK
  • Experience building and operating CI/CD pipelines using Jenkins and GitHub
  • Proficiency in Python, Go, or Bash for automation
  • Hands-on experience with Datadog or a comparable observability platform for monitoring, alerting, and log management
  • Demonstrated experience leading incident response in complex, distributed systems
  • Working knowledge of SLO/SLI frameworks, error budgets, and disaster recovery planning against defined RTO/RPO objectives
  • Familiarity with SOC 2 compliance frameworks and experience contributing to audit readiness, access controls, and security control evidence collection
  • A collaborative, ownership-driven mindset with strong communication skills
Job Responsibility:
  • Architect, implement, and operate scalable, resilient, and secure AWS infrastructure
  • Lead infrastructure-as-code initiatives to ensure all environments are reproducible, auditable, and consistently configured
  • Design, maintain, and improve CI/CD pipelines using Jenkins and GitHub
  • Own the Datadog observability platform, including dashboards, monitors, alerting thresholds, and log management
  • Define and maintain SLOs, SLIs, and error budgets
  • Serve as a senior technical responder across the full incident lifecycle within a shared on-call rotation
  • Lead blameless postmortems
  • Refine, implement, and test disaster recovery plans to meet RTO/RPO objectives
  • Contribute to SOC 2 audit readiness with a focus on access controls, incident response, and risk mitigation
  • Mentor junior SREs through code reviews, incident pairing, and documentation
What we offer:
  • Impact that matters
  • Flexibility and trust
  • Remote-first and results driven
  • Growth and development
  • Access to learning resources, leadership programs, and real opportunities to take on new challenges
  • Competitive rewards
  • Comprehensive benefits
  • Performance-based bonus program
  • Equity opportunities
  • Time for life
  • Fulltime

Site Reliability Engineer III

Under limited supervision, the Site Reliability Engineer III is responsible for ...
Location:
United States, Birmingham, Alabama
Salary:
Not provided
Genuine Parts Company
Expiration Date:
Until further notice
Requirements:
  • Typically requires a bachelor's degree and five (5) or more years of related experience or an equivalent combination
  • Understanding of Kubernetes, containers, clusters, and elastic scalability
  • Expertise in SRE principles
  • Mindset of continually finding ways to drive scalability, stability, and performance
  • Cloud Services experience with Google Cloud Platform (GCP)
  • Experience with API, service-based or microservice-based architecture
  • Proficiency in infrastructure, network, database, operating systems, or security troubleshooting and remediation
  • Architecture-level knowledge of Windows and Linux and Infrastructure systems
  • Experience with production deployment, monitoring, and operational support for enterprise-class applications (Dynatrace a plus)
  • Experience working with Continuous Integration/Continuous Deployment tools
Job Responsibility:
  • Gathers and analyzes metrics from monitoring platforms to assist in performance tuning and fault tolerance
  • Partners with development teams to improve services through testing and release procedures
  • Participates in system design, platform management and capacity planning
  • Balances feature development speed and reliability with service-level objectives
  • Works closely with the incident response team to restore service to normal operation
  • Understands debugging and applies troubleshooting skills
  • Investigates, blocks and rate-limits unwanted traffic
  • Utilizes monitoring systems and dashboards for proactive changes and alerting
  • Establishes continuous process improvement cycles where the process, performance, and supporting technologies are reviewed and enhanced where applicable
  • Performs other duties as assigned.
What we offer:
  • Options for healthcare coverage, 401(k), tuition reimbursement, vacation, sick, and holiday pay.
  • Fulltime

Network Engineer III

We are seeking talented, experienced Network Engineering professionals to join t...
Location:
United States, Huntsville
Salary:
79119.18 - 190122.20 USD / Year
Arcfield
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Network Engineering, Computer Engineering, Computer Science, or a related technical field with 5-7 years of experience, MS 3-5 years of experience, PhD 0-2 years of experience
  • Valid Security+ or DoD Directive 8570.01 IAT Level II certification
  • Valid Cisco Certified Network Associate (CCNA) certification or higher
  • 3+ years of general work experience designing, developing, and implementing network architecture
  • 3+ years of experience configuring and troubleshooting routers, switches, and associated network protocols such as OSPF, EIGRP, and Rapid PVST
  • 3+ years of experience supporting network security hardware and solutions like Cisco ASA firewalls, IPSEC, Access Control Lists (ACLs), and Network Address Translation (NAT)
  • Experience with network analysis tools such as SolarWinds and Wireshark
  • Experience with Visio
  • Possess and maintain a Secret clearance
Job Responsibility:
  • Develop and deploy communications architectures to meet dynamic mission requirements across multiple ranges
  • Design, install, maintain, and repair deployable communications sites, ensuring optimal performance and reliability
  • Route data with type 1 encryption between various ranges and sites, ensuring secure and reliable communication
  • Configure and secure enterprise services, including routers, switches, firewalls, and access control solutions
  • Troubleshoot and optimize routing protocols such as OSPF and EIGRP, along with QoS, VPNs, and Spanning Tree Protocol
  • Interface with analog voice communications systems across multiple locations to ensure seamless integration
  • Maintain and update systems with patches to comply with Risk Management Framework (RMF) Information Assurance Vulnerability Management (IAVM) requirements
  • Collaborate with systems administrators on other standalone systems to improve their network architecture
  • Develop comprehensive system documentation, including standard operating procedures and network drawings
  • Regularly report project status to the Lead Network Engineer and Senior Management
What we offer:
  • Health Insurance
  • Life Insurance
  • Paid Time Off
  • Holiday Pay
  • Short Term and Long-Term Disability
  • Retirement and Savings
  • Learning and Development opportunities
  • Wellness programs
  • Fulltime

Network Engineer III

We are seeking talented, experienced Network Engineering professionals to join t...
Location:
United States, Huntsville
Salary:
Not provided
Arcfield
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Network Engineering, Computer Engineering, Computer Science, or a related technical field with 5-7 years of experience, MS 3-5 years of experience, PhD 0-2 years of experience
  • Valid Security+ or DoD Directive 8570.01 IAT Level II certification
  • Valid Cisco Certified Network Associate (CCNA) certification or higher
  • 3+ years of general work experience designing, developing, and implementing network architecture
  • 3+ years of experience configuring and troubleshooting routers, switches, and associated network protocols such as OSPF, EIGRP, and Rapid PVST
  • 3+ years of experience supporting network security hardware and solutions like Cisco ASA firewalls, IPSEC, Access Control Lists (ACLs), and Network Address Translation (NAT)
  • Experience with network analysis tools such as SolarWinds and Wireshark
  • Experience with Visio
  • Possess and maintain a Secret clearance
Job Responsibility:
  • Develop and deploy communications architectures to meet dynamic mission requirements across multiple ranges
  • Design, install, maintain, and repair deployable communications sites, ensuring optimal performance and reliability
  • Route data with type 1 encryption between various ranges and sites, ensuring secure and reliable communication
  • Configure and secure enterprise services, including routers, switches, firewalls, and access control solutions
  • Troubleshoot and optimize routing protocols such as OSPF and EIGRP, along with QoS, VPNs, and Spanning Tree Protocol
  • Interface with analog voice communications systems across multiple locations to ensure seamless integration
  • Maintain and update systems with patches to comply with Risk Management Framework (RMF) Information Assurance Vulnerability Management (IAVM) requirements
  • Collaborate with systems administrators on other standalone systems to improve their network architecture
  • Develop comprehensive system documentation, including standard operating procedures and network drawings
  • Regularly report project status to the Lead Network Engineer and Senior Management
  • Fulltime

Software Engineer Level III – Forward Deployed

We are seeking a skilled Software Engineer who will design, build, and maintain ...
Location:
China, Shanghai; Dalian; Wuhan
Salary:
Not provided
Pfizer
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Engineering, or related field with 5-8 years of relevant experience
  • AI-Augmented Development: integrate AI tools strategically into development workflow, review AI-generated code with rigor
  • Business Immersion: apply deep domain knowledge to technical solutions, bridge business and technology conversations
  • Data Integration: integrate multiple data sources independently, clean messy datasets
  • Full-Stack Development: deliver complete features end-to-end independently—frontend, backend, database, and infrastructure
  • Multi-Audience Communication: present complex topics clearly to any audience, translate between technical and business language
  • Problem Discovery: navigate ambiguous problem spaces independently, discover requirements through observation
  • Rapid Prototyping & Validation: deliver working solutions rapidly (days not weeks)
  • Site Reliability Engineering: design observability strategies for services, lead incident response
  • Stakeholder Management: manage multiple stakeholders with different interests
Job Responsibility:
  • Delivery: Own feature delivery from design through deployment, making sound technical trade-offs to ship value on time
  • AI: Integrate AI capabilities into solutions, critically evaluate AI-generated code
  • People: Mentor junior engineers on technical topics, contribute to hiring through interviews
  • Business: Translate business needs into technical solutions, manage stakeholder expectations
  • Process: Contribute to process improvement, maintain team workflows
  • Documentation: Create clear documentation for features you build, contribute to team knowledge bases
  • Fulltime