SRE Developer

India, Bangalore South · Job Posted March 26, 2026

Job Description

We are looking for a proactive SRE Developer with 3–5 years of experience to manage Business‑As‑Usual (BAU) SRE operations while driving automation, reliability, and operational excellence. The role focuses on incident management, CI/CD operations, observability, and leveraging AI‑assisted tools to reduce manual effort and improve system reliability across cloud‑native environments.

Job Responsibility

  • Handle SRE BAU operations including incident management, root cause analysis, problem resolution, and service restoration
  • Manage and maintain CI/CD pipelines and deployment automation across environments
  • Improve system reliability, scalability, and performance through automation and proactive monitoring
  • Implement and manage observability solutions including logging, metrics, alerting, and dashboards
  • Utilize AI tools (CursorAI, Generative AI, automation copilots) for faster troubleshooting, documentation, code generation, and incident analysis
  • Collaborate with engineering, product, and security teams to ensure smooth releases and secure infrastructure
  • Reduce manual operational effort through AI-assisted automation and scripting
  • Drive DevOps best practices and continuous improvement initiatives

Requirements

  • Strong hands-on experience in SRE or DevOps operations
  • Expertise in CI/CD tools such as GitHub Actions, GitLab CI, Jenkins, Azure DevOps
  • Experience with monitoring and observability tools (Grafana, Prometheus, ELK, Splunk, Datadog, New Relic, etc.)
  • Good understanding of cloud platforms (AWS, Azure, or GCP)
  • Practical experience using AI tools in daily engineering workflows (CursorAI, ChatGPT, GenAI tools, automation assistants)
  • Ability to identify repetitive operational tasks and automate using AI or scripts
  • Familiarity with AI-driven troubleshooting and documentation
  • Proficiency in Python, Bash, PowerShell, or similar scripting languages
  • Exposure to Infrastructure as Code (IaC) tools such as Terraform, CloudFormation, ARM, or Ansible

Nice to have

  • Experience supporting production environments with on-call rotations
  • Knowledge of containerization and orchestration (Docker, Kubernetes)
  • Understanding of performance tuning and capacity planning
  • Experience integrating AI into operational workflows or automation pipelines
  • Strong ownership mindset, adaptability, and continuous improvement attitude
  • Excellent communication and cross‑team collaboration skills

Similar Jobs for SRE Developer

8 matching positions

SRE (Developer Relations)

Location: Japan, Tokyo (23 wards)
Salary: 7,000,000 - 10,000,000 JPY / Year
Company: Randstad
Expiration Date: Until further notice
Requirements
  • Fluent in English
  • Minimum 4 years of experience as an SRE engineer or Infrastructure Engineer
  • Experience in consulting / forward deployed engineering (FDE)
  • Experience with Kubernetes
  • Experience with debugging, problem solving, and resolving incidents
  • Experience with application development
  • Experience in multiple widely-used programming languages
  • Experience in AWS, GitHub, JIRA/Confluence, Slack, Linux (bash, CLI)
What we offer
  • Health insurance
  • Employees' pension insurance
  • Employment insurance
  • Full-time

SRE Ansible developer

Location: Canada, Toronto
Salary: 155,000 USD / Year
Company: Realign
Expiration Date: Until further notice
Requirements
  • Design and implement automation scripts using Ansible for infrastructure provisioning and configuration management
  • Develop and maintain monitoring solutions leveraging Dynatrace for application and system performance
  • Configure and optimize ITRS monitoring tools to ensure proactive alerting and incident management
  • Collaborate with development and operations teams to improve system reliability and scalability
  • Automate deployment pipelines and integrate with CI/CD processes for faster releases
  • Troubleshoot performance issues and implement solutions to enhance system resilience
  • Ensure compliance with security and operational standards across environments
  • Document automation workflows, monitoring configurations, and best practices for knowledge sharing
  • Total Experience: 6-8 years
  • Full-time

Python Developer - Site Reliability Engineering (SRE)

We are seeking a skilled Python Developer with experience in the Site Reliabilit...
Location: Canada, Montreal
Salary: Not provided
Company: NTT DATA
Expiration Date: Until further notice
Requirements
  • 3+ years of experience with Python development
  • 6 years of experience working with Infrastructure as Code (Terraform and Ansible)
  • Experience with CI/CD pipelines, preferably GitHub Actions and Jenkins
  • Strong understanding of object-oriented design and development principles
  • Proficiency in Linux/Unix environments
  • Experience working with database technologies (preferably NoSQL), including data modeling, testing, and performance tuning
  • Ability to write reusable, optimized, maintainable, and well‑documented code following industry best practices
  • Experience implementing open-source monitoring and observability tools such as Prometheus, Grafana, Splunk, or OpenTelemetry
  • Strong problem‑solving skills and ability to take ownership of tasks and drive them independently to closure
  • Understanding of networking concepts (TCP/IP, DNS, Load Balancing)
Job Responsibility
  • Develop quality software working with public cloud service provider (CSP) infrastructure across different Public Cloud areas
  • Develop, enhance, and integrate automation workflows for Public Cloud Service Providers (CSP), initially focused on Azure, and integrate with in-house tooling
  • Integrate automation workflows into CI/CD pipelines using GitHub Actions and Jenkins
  • Build proof-of-concept solutions in new areas of cloud and automation development
  • Provide technical support and debugging for application failures in both on-premises and cloud environments
  • Participate in all phases of the Software Development Life Cycle (SDLC), including analysis, design, coding, testing, and deployment
  • Evaluate, onboard, and implement emerging DevOps and automation tools to improve efficiency
  • Build and integrate observability into cloud platforms and solutions using open-source tools (Prometheus, Grafana, OpenTelemetry)
  • Identify, highlight, and reduce operational toil through automation, architectural improvements, and process optimization
  • Collaborate with global teams to understand requirements, develop high‑quality code, and deliver cloud-focused projects

Credit Risk Counterparty Analyst - SRE

Join us at Barclays as a Credit Risk Counterparty Analyst - Site Reliability Eng...
Location: United Kingdom, Glasgow
Salary: Not provided
Company: Barclays
Expiration Date: Until further notice
Requirements
  • Hands-on/technical experience with high proficiency in SQL, Database Technologies, Unix, Windows, primarily within Investment Banking domain
  • Experience with ITIL concepts and best practices
  • Experience of using configuration management tools and reporting (preferred Service Management Tool - Service First / SNOW)
  • Experience in batch monitoring tools (preferably, Autosys)
Job Responsibility
  • Provision of technical support for the service management function to resolve more complex issues for a specific client or group of clients
  • Develop the support model and service offering to improve the service to customers and stakeholders
  • Execution of preventative maintenance tasks on hardware and software and utilisation of monitoring tools/metrics to identify, prevent and address potential issues and ensure optimal performance
  • Maintenance of a knowledge base containing detailed documentation of resolved cases for future reference, self-service opportunities and knowledge sharing
  • Analysis of system logs, error messages and user reports to identify the root causes of hardware, software and network issues, and providing a resolution to these issues by fixing or replacing faulty hardware components, reinstalling software, or applying configuration changes
  • Automation, monitoring enhancements, capacity management, resiliency, business continuity management, front office specific support and stakeholder management
  • Identification and remediation or raising, through appropriate process, of potential service impacting risks and issues
  • Proactively assess support activities implementing automations where appropriate to maintain stability and drive efficiency
  • Actively tune monitoring tools, thresholds, and alerting to ensure issues are known when they occur
What we offer
  • Competitive holiday allowance
  • Life assurance
  • Private medical care
  • Pension contribution
  • Full-time

Technical Architect

Lead the design, modernization, and implementation of scalable, secure, and resi...
Location: United States, Armonk
Salary: 247,319 - 250,000 USD / Year
Company: The New York Times
Expiration Date: Until further notice
Requirements
  • Bachelor's degree or equivalent in Computer Science, Information Technology, Engineering or related and five (5) years of experience as a Consultant Architect, Virtualization Architect, Senior Cloud Architect or related
  • Five (5) years of experience must include utilizing Hybrid Cloud, AWS, Azure, Red Hat Linux, Terraform, Ansible, Python, VMware Cloud Foundation (VCF) Stack
Job Responsibility
  • Lead the design, modernization, and implementation of scalable, secure, and resilient hybrid cloud and containerized infrastructure platforms
  • Define and lead the technical architecture strategy for hybrid cloud, container orchestration (Kubernetes, RedHat OpenShift, VMware Tanzu), and virtualized environments (VMware, Nutanix, RedHat)
  • Architect secure and scalable infrastructure across private, public, and hybrid cloud ecosystems
  • Evaluate, design, and implement solutions for computing, storage, networking, identity, and availability zones across global regions
  • Design and implement Kubernetes, RedHat OpenShift clusters across multi-cloud and on-prem environments, including CI/CD integration, policy enforcement, and workload orchestration
  • Define governance, observability, and security patterns for containerized workloads
  • Lead Infrastructure-as-Code (IaC) initiatives using Terraform, Ansible, GitOps, GitHub, PowerShell, and Python
  • Enable self-service infrastructure capabilities through automation frameworks and developer platforms
  • Partner with DevSecOps, SRE, Infrastructure Operations, Security, and Datacenter Operation teams to scope, define, size, and execute application onboarding, modernization, and consolidation initiatives
  • Mentor engineering teams and influence enterprise architecture (EA) roadmaps
  • Full-time

Principal AI Architect

Wells Fargo is seeking a visionary Principal Systems Architect to shape the futu...
Location: United States, Iselin
Salary: 159,000 - 305,000 USD / Year
Company: Wells Fargo
Expiration Date: June 25, 2026
Requirements
  • 7+ years of architecture experience
  • 7+ years of experience creating strategy
  • 2+ years of AI, GenAI, and Agentic AI solutions with Model Risk Management (MRM) and Artificial Intelligence Risk Review (AIRR) governance requirements
Job Responsibility
  • Act as an advisor to leadership to develop or influence applications, network, information security, database, operating systems, or web technologies for highly complex business and technical needs across multiple groups
  • Lead the strategy and resolution of highly complex and unique challenges requiring in-depth evaluation across multiple areas or the enterprise, delivering solutions that are long-term, large-scale and require vision, creativity, innovation, advanced analytical and inductive thinking
  • Translate advanced technology experience, an in-depth knowledge of the organization's tactical and strategic business objectives, the enterprise technological environment, the organization structure, and strategic technological opportunities and requirements into technical engineering solutions
  • Provide vision, direction and expertise to leadership on implementing innovative and significant business solutions
  • Artificial Intelligence (AI) and Innovation - Promote a data-driven culture and drive architecture-led innovation
  • Lead architecture alignment for AI, GenAI, and Agentic AI solutions with Model Risk Management (MRM) and Artificial Intelligence Risk Review (AIRR) governance requirements, ensuring designs support required risk assessments, approvals, and enterprise control expectations
  • Partner with Model Risk Management, BCM, Legal, Compliance, Cyber, Data Use Assessment, and Risk Assessable Unit (RAU)-aligned stakeholders to ensure AI-enabled solutions are designed for appropriate model risk ranking, validation, explainability, control uplift, and readiness for AIRR and related tollgates where applicable
  • Define architecture patterns and engineering guardrails that support responsible AI, including traceability, monitoring, auditability, human-in-the-loop controls, secure data usage, resiliency, and change management across the AI service lifecycle
  • Ensure target-state architectures and implementation roadmaps account for post-deployment monitoring, control sustainability, and re-assessment triggers associated with model changes, scope expansion, data/input changes, platform changes, and evolving regulatory requirements
  • Advise business, product, and engineering leaders on how to accelerate AI adoption while meeting enterprise expectations for risk governance, model oversight, policy adherence, and safe deployment at scale
What we offer
  • Health benefits
  • 401(k) Plan
  • Paid time off
  • Disability benefits
  • Life insurance, critical illness insurance, and accident insurance
  • Parental leave
  • Critical caregiving leave
  • Discounts and savings
  • Commuter benefits
  • Tuition reimbursement
  • Full-time

Public Cloud Network Lead

Join us at Barclays as a Public Cloud Network Lead, to architect, implement and ...
Location: United Kingdom, London; Glasgow
Salary: Not provided
Company: Barclays
Expiration Date: Until further notice
Requirements
  • Multi-Cloud Network Architecture & Hybrid Connectivity – Lead enterprise-scale network design across AWS, Azure, and GCP, delivering hybrid connectivity, encrypted interconnects (MACsec/IPsec), circuit provider management, and legacy infrastructure remediation through Infrastructure as Code
  • Network Security & Compliance – Implement Zero Trust segmentation, deploy cloud-native firewall controls, and ensure compliance with PCI-DSS, DORA, and internal governance frameworks
  • Strategic Planning, Consultancy & Stakeholder Engagement – Define cloud network strategy, evaluate emerging technologies, produce ADRs and HLD/LLD designs, lead Landing Zone design, and influence senior stakeholders on risk, strategy, and cost optimisation
  • Operational Excellence & Incident Response – Own incident escalation, SLA/SLO monitoring, flow analysis, and SRE enablement to drive network operational excellence
  • Automation, IaC & DevOps Practices – Build reusable Terraform, CloudFormation, and Bicep IaC with CI/CD pipelines and Python/Bash automation for standardised network provisioning
Job Responsibility
  • Architect, implement, and operate enterprise-grade multi-cloud network infrastructure at scale for Barclays
  • Design secure, high-performance hybrid and multi-cloud architectures connecting thousands of cloud accounts across global regions to Barclays' on-premises infrastructure
  • Work horizontally across GTIS Networks, SRE, DevOps, Product, and senior leadership to deliver strategic initiatives and resolve complex technical debt
  • Mentor engineers and serve as the escalation point for critical network incidents
  • Build Engineering: Development, delivery, and maintenance of high-quality infrastructure solutions to fulfil business requirements
  • Incident Management: Monitoring of IT infrastructure and system performance to measure, identify, address, and resolve any potential issues, vulnerabilities, or outages
  • Automation: Development and implementation of automated tasks and processes to improve efficiency and reduce manual intervention
  • Security: Implementation of a secure configuration and measures to protect infrastructure against cyber-attacks, vulnerabilities, and other security threats
  • Teamwork: Cross-functional collaboration with product managers, architects, and other engineers to define IT Infrastructure requirements, devise solutions, and ensure seamless integration and alignment with business objectives
  • Learning: Stay informed of industry technology trends and innovations, and actively contribute to the organization's technology communities to foster a culture of technical excellence and growth
What we offer
  • Competitive holiday allowance
  • Life assurance
  • Private medical care
  • Pension contribution
  • Full-time

Staff Engineer, Site Reliability Engineer

OnStar is a cornerstone of General Motors' connected services—bringing safety, s...
Location: Ireland, Dublin
Salary: Not provided
Company: General Motors
Expiration Date: Until further notice
Requirements
  • 8+ years in SRE, DevOps, or systems engineering, including experience managing or mentoring high-impact teams
  • Track record of building and maintaining high-scale, cloud-native systems (preferably AWS, GCP, or Azure)
  • Expertise in container orchestration and deployment strategies using Kubernetes and CI/CD pipelines
  • Proficiency in Python, Go, or Java, with strong code review and readability standards
  • Experience leading cross-functional infrastructure projects, configuration strategy, or organizational tooling initiatives
  • Ability to think and act under pressure
  • Strong communication skills
Job Responsibility
  • Lead the design and implementation of scalable, fault-tolerant, and observable infrastructure supporting OnStar mobile and web experiences, in-vehicle services, and the backend platforms and integrations that power them
  • Champion configuration management, infrastructure refactoring, and testing frameworks to strengthen system resilience
  • Partner across SRE, development, and product teams to improve service reliability, deployment safety, and incident response practices
  • Drive internal consultation and strategic planning on reliability standards for new OnStar capabilities, customer-facing releases, and platform initiatives
  • Define and evolve observability strategy using tools such as Prometheus, Grafana, and Datadog, with automated alerting and actionable SLO dashboards
  • Own and improve on-call practices, manage blameless postmortems, and guide root cause analysis to eliminate recurring failures
  • Mentor engineers and help shape a high-performance culture rooted in extreme ownership and operational excellence
  • Support compliance and privacy-driven engineering initiatives across connected services, with potential crossover into areas like data retention and safety certification tooling
  • Full-time