Cloud Security Site Reliability Engineer

Singapore, Singapore · Employment contract · Job Posted November 14, 2025

Job offer has expired

Job Description

This role sits within the Cloud Security team at Citi and focuses on building tools and platforms for Private and Public Cloud Security, Container Security, and Secrets products. Engineering excellence, collaboration, and reducing toil are key principles of this role. The candidate will help raise the bar in the SRE strategy and support Citi’s multi-year transformation journey.

Job Responsibility

  • Working across Container and Secrets products, in both Public and Private Cloud, as well as cloud-native products
  • Architecting and building tools and platforms that provide capabilities for SRE
  • Collaborating with multiple stakeholders and partners across Engineering and Operations, as well as partner teams within the wider Citi organisation
  • Actively owning production-level incidents until resolution

Requirements

  • Bachelor’s degree or equivalent work experience
  • 6+ years of relevant work experience
  • Highly motivated self-starter with excellent interpersonal and communication skills
  • Certification or formal training in site reliability engineering concepts and practices
  • Prior experience working towards SLIs, SLOs and observability capabilities at a large scale
  • 4+ years of experience in Python or Java on large-scale systems, alongside Linux-based scripting languages
  • Experience working on observability, logging and metrics toolsets
  • Experience with Kubernetes (k8s) and container technologies such as Docker, OpenShift and EKS
  • Experience with public cloud technologies such as AWS, GCP or Azure
  • Experience with Secrets products such as HashiCorp Vault or CyberArk
  • Highly effective at navigating large and complex organisations
  • Ability to work under pressure and manage tight deadlines or unexpected changes in expectations or requirements
  • Experience working in CISO or security-led organisations desirable but not essential

What we offer

Global benefits designed to support well-being, growth, and work-life balance.

Similar Jobs for Cloud Security Site Reliability Engineer

8 matching positions

Staff Site Reliability Engineer - Cloud

Elevate Global Operations as our Next Cloud Site Reliability Engineer (OpenTelem...
Location: United Kingdom
Salary: Not provided
Company: Trimble Inc.
Expiration Date: Until further notice
Requirements
  • Hands-on experience with the OpenTelemetry Collector, APIs, and SDKs
  • Extensive experience with observability tools like New Relic, Datadog, or Splunk
  • Strong proficiency in Infrastructure as Code (Terraform, Ansible) and cloud platforms (AWS, GCP, or Azure)
  • Deep understanding of containerization and orchestration using Docker and Kubernetes
  • Advanced coding skills in Python, Go, or Java for building robust automation and monitoring tools
  • Experience leveraging AI coding assistants like GitHub Copilot to accelerate development
Job Responsibility
  • Lead a global "OTel First" strategy, implementing OpenTelemetry at scale across a diverse technological landscape
  • Spearhead the development of automation scripts and Infrastructure as Code using Terraform to ensure seamless, reproducible platform delivery
  • Optimize platform performance and cost-efficiency, ensuring our observability tools scale economically as our data grows
  • Collaborate with engineering teams to embed reliability and security standards into new features from the ground up
  • Drive root cause analysis and problem management to proactively prevent incidents and improve the customer experience

Cloud Site Reliability Engineer

Airbus Commercial Aircraft is looking for a Cloud Site Reliability Engineer (f/m...
Location: France, Toulouse
Salary: Not provided
Company: Airbus
Expiration Date: Until further notice
Requirements
  • Expert-level proficiency in GCP Deployment Manager, Google Cloud CDK (Cloud Development Kit), and Python
  • Comprehensive and hands-on knowledge of a wide range of GCP services and their architectural best practices
  • Strong background and practical experience in cloud security principles and compliance frameworks
  • Proven experience with DevOps methodologies and working within agile environments
  • Demonstrated ability to architect and implement scalable infrastructure solutions
  • A strategic thinker with a keen focus on optimizing infrastructure efficiency, scalability, and cost-effectiveness
  • A collaborative team player with excellent communication skills, capable of working effectively across engineering teams.
Job Responsibility
  • Spearhead the development and maintenance of our cloud infrastructure using GCP Deployment Manager templates and CDK scripts for efficient resource provisioning
  • Architect and implement highly scalable, secure, and resilient cloud infrastructure solutions across various GCP services
  • Collaborate closely with development and operations teams to deeply integrate IaC practices into our entire software development lifecycle
  • Conduct thorough code reviews and mentor team members, ensuring adherence to best practices in cloud architecture, security, and operational excellence
  • Drive continuous improvement in our infrastructure by identifying opportunities for automation, optimization, and enhanced reliability.
What we offer
  • Financial rewards: Attractive salary, success and profit-sharing agreements, an employee savings plan topped up by Airbus, and a voluntary employee stock purchase plan
  • Work / Life Balance: Extra days off for special occasions, holiday transfer option, a Staff council offering many social, cultural and sports activities and other services
  • Wellbeing / Health: Complementary health insurance coverage (disability, invalidity, death). Depending on the site: health services center, concierge services, gym, carpooling application
  • Individual development: Great upskilling opportunities and development prospects with unlimited access to 10,000+ e-learning courses to develop your employability, certifications, expert career path, accelerated development programmes, national and international mobility
Employment type: Full-time

Cloud Site Reliability Engineer

Airbus Commercial Aircraft is looking for a Cloud Site Reliability Engineer (f/m...
Location: France, Toulouse
Salary: Not provided
Company: Airbus
Expiration Date: Until further notice
Requirements
  • Expert-level proficiency in GCP Deployment Manager, Google Cloud CDK (Cloud Development Kit), and Python
  • Comprehensive and hands-on knowledge of a wide range of GCP services and their architectural best practices
  • Strong background and practical experience in cloud security principles and compliance frameworks
  • Proven experience with DevOps methodologies and working within agile environments
  • Demonstrated ability to architect and implement scalable infrastructure solutions
  • A strategic thinker with a keen focus on optimizing infrastructure efficiency, scalability, and cost-effectiveness
  • A collaborative team player with excellent communication skills, capable of working effectively across engineering teams
Job Responsibility
  • Spearhead the development and maintenance of our cloud infrastructure using GCP Deployment Manager templates and CDK scripts for efficient resource provisioning
  • Architect and implement highly scalable, secure, and resilient cloud infrastructure solutions across various GCP services
  • Collaborate closely with development and operations teams to deeply integrate IaC practices into our entire software development lifecycle
  • Conduct thorough code reviews and mentor team members, ensuring adherence to best practices in cloud architecture, security, and operational excellence
  • Drive continuous improvement in our infrastructure by identifying opportunities for automation, optimization, and enhanced reliability
What we offer
  • Financial rewards: Attractive salary, success and profit-sharing agreements, an employee savings plan topped up by Airbus, and a voluntary employee stock purchase plan
  • Work / Life Balance: Extra days off for special occasions, holiday transfer option, a Staff council offering many social, cultural and sports activities and other services
  • Wellbeing / Health: Complementary health insurance coverage (disability, invalidity, death). Depending on the site: health services center, concierge services, gym, carpooling application
  • Individual development: Great upskilling opportunities and development prospects with unlimited access to 10,000+ e-learning courses to develop your employability, certifications, expert career path, accelerated development programmes, national and international mobility
Employment type: Full-time

Sr Principal Site Reliability Engineer (Sovereign Cloud)

The Prisma Access team is seeking a seasoned Principal Site Reliability Engineer...
Location: Bulgaria, Sofia
Salary: Not provided
Company: Palo Alto Networks
Expiration Date: Until further notice
Requirements
  • 10+ years of experience in Infrastructure, SRE, or DevOps roles
  • BS or MS in Computer Science, a related field, or equivalent professional experience
  • 7+ years of experience with GCP, and expertise in its architecture, services and PKI concepts for cloud security
  • Expert troubleshooting skills to resolve cloud infrastructure and service issues, effectively identifying root cause and devising effective solutions
  • Proficiency in automation using Python and shell scripting
  • Expertise in Infrastructure as Code (IaC) with Terraform and Helm, leveraging AI tools for development
  • Solid experience with Kubernetes, container networking, and container workloads
  • Strong Linux administration skills
  • Proficiency with CI/CD pipelines, GitOps principles, and tooling like GitLab and Jenkins
  • Excellent written and verbal communication skills, with the ability to collaborate effectively to drive outcomes
Job Responsibility
  • Design, build, and operate reliable, secure Cloud infrastructure across multi-cloud environments for our sovereign customers
  • Lead cross-functional initiatives to ensure applications are production-ready, scalable, secure, and resilient
  • Develop expertise in new technologies, embracing continuous learning and the adoption of AI tools
  • Develop tools and automation frameworks, championing Infrastructure as Code (IaC) and Monitoring as Code (MaC) principles
  • Automate robust deployments and orchestrate end-to-end monitoring and alerting solutions
  • Participate in on-call rotations to support critical business and production systems
  • Lead root cause analysis of critical issues, driving improvements and preventing recurrence
  • Champion the success of SRE and DevOps initiatives, aligning technical decisions with business goals
Employment type: Full-time

Sr Principal Site Reliability Engineer (Sovereign Cloud)

Palo Alto Networks runs a large infrastructure and is one of the largest GCP cus...
Location: Bulgaria, Sofia
Salary: Not provided
Company: Palo Alto Networks
Expiration Date: Until further notice
Requirements
  • 10+ years as an engineer in Infrastructure, Operations, DevOps, or System Engineering
  • 7+ years building high availability, scalable cloud-native applications on AWS and GCP
  • BS or MS in Computer Science, a related field, or equivalent professional experience required
  • Expertise in configuration management with a framework such as Ansible, Terraform, Helm
  • Passion for infrastructure and monitoring as code
  • Solid experience in container workloads and Kubernetes
  • Familiarity with PKI and networking concepts
  • In-depth knowledge of different security controls (App-ID, User-ID, security profile, URL category, content, SSL decryption, firewall MFA, etc.)
  • Linux administration, internals, and network troubleshooting
  • Proficiency with programming languages like Golang or Python along with shell scripting to automate tasks
Job Responsibility
  • Contribute to the success of SRE and DevOps
  • Develop expertise in new technologies
  • Work with developers, researchers, data scientists, and security experts
  • Design, build and operate reliable, secure Cloud infrastructure
  • Ensure that applications are production-ready, scalable, and reliable
  • Develop tools and automation frameworks
  • Automate the deployment of robust services
  • Orchestrate end-to-end monitoring and alerting
  • Participate in on-call rotations to support critical business and production systems
  • Lead root cause analysis of critical business and production issues
Employment type: Full-time

Principal Site Reliability Engineer (Sovereign Cloud)

As a Principal Site Reliability Engineer, you will serve as the technical author...
Location: Bulgaria, Sofia
Salary: Not provided
Company: Palo Alto Networks
Expiration Date: Until further notice
Requirements
  • 7+ years of experience in Infrastructure, SRE, or DevOps roles
  • BS or MS in Computer Science, a related field, or equivalent professional experience
  • Kubernetes Mastery: Expert-level experience (6+ years) managing production K8s workloads (preferably within GKE, but will also consider EKS)
  • Deep understanding of Networking, Storage, and RBAC
  • CI/CD & GitOps: Hands-on expertise with ArgoCD and modern pipeline runners (GitHub Actions, GitLab CI, or Jenkins)
  • Programming: Proficient in Python for systems programming and automation
  • Security Mindset: Proven experience integrating security scanning and compliance checks within a containerized environment
  • Modern Workflow: Experience (or strong desire) using AI-pair programming tools like Cursor and Claude to multiply personal and team productivity
  • Excellent written and verbal communication, able to collaborate and rally support
  • Self-disciplined, self-managed, self-motivated, strong sense of ownership, urgency, and drive
Job Responsibility
  • Infrastructure Leadership: Architect and oversee large-scale Kubernetes clusters in GKE, ensuring high availability, performance tuning, and cost optimization
  • GitOps & Orchestration: Design and refine complex CI/CD lifecycles using ArgoCD, moving toward a fully declarative infrastructure-as-code model
  • Security Engineering: Implement and manage security scanning tools (e.g., Prisma Cloud, Snyk, or GKE native security) to ensure container integrity and shift-left security compliance
  • Automation & Tooling: Develop sophisticated automation scripts and internal tools using Python to eliminate manual toil and improve system observability
  • AI-Driven Development: Lean into the future of engineering by utilizing Cursor and Claude to accelerate coding, debugging, and documentation tasks
  • Incident Management: Act as a final escalation point for complex infrastructure outages, conducting blameless post-mortems to drive systemic improvements
  • Participate in on-call rotations to support critical business and production systems
Employment type: Full-time

Principal Site Reliability Engineer (Sovereign Cloud)

Your Career: Palo Alto Networks runs a large infrastructure and is one of the la...
Location: Bulgaria, Sofia
Salary: Not provided
Company: Palo Alto Networks
Expiration Date: Until further notice
Requirements
  • 7+ years as an engineer in Infrastructure, Operations, DevOps, or System Engineering
  • 7+ years building high availability, scalable cloud native applications on AWS or GCP
  • BS or MS in Computer Science, a related field, or equivalent professional or military experience required
  • Expertise in configuration management with a framework such as Ansible, Terraform, Helm
  • Expertise in infrastructure automation tasks using Python and shell scripting
  • Experience in Site Reliability Engineering, Production Engineering, or DevOps
  • Expertise in public or private cloud
  • Solid experience in Kubernetes and containers
  • Linux administration, internals, and network troubleshooting
  • Proficiency with programming languages like Python, Java, Golang, and shell scripting to automate tasks
Job Responsibility
  • Contribute to the success of SRE and DevOps
  • Develop expertise in new technologies
  • Work with developers, researchers, data scientists, and security experts
  • Design, build and operate reliable, secure Cloud infrastructure
  • Ensure that applications are production-ready, scalable, and reliable
  • Develop tools and automation frameworks
  • Automate the deployment of robust services
  • Orchestrate end-to-end monitoring and alerting
  • Participate in on-call rotations to support critical business and production systems
  • Lead root cause analysis of critical business and production issues
Employment type: Full-time

Site Reliability Engineer (FedRAMP / Security) - CA

Coralogix is a modern, full-stack observability platform transforming how busine...
Location: United States, Los Angeles
Salary: 170,000.00 - 220,000.00 USD / Year
Company: Coralogix
Expiration Date: Until further notice
Requirements
  • At least 5 years of experience as a DevOps Engineer/SRE in production environments
  • In-depth experience with Kubernetes; operating and monitoring are key parts
  • At least 2 years of experience with FedRAMP compliance (High/Moderate levels), vulnerability management, and continuous monitoring, including scanning, patching, and reporting (an advantage)
  • High familiarity with monitoring tools such as Coralogix, Grafana, Prometheus
  • Experience in AWS or other cloud providers
  • Experience with infrastructure as code (Terraform, Crossplane, etc.)
  • Understanding of networking, from networking layers to different networking protocols (HTTP, gRPC, SSL)
  • Some software engineering experience, preferably in Golang
  • An advantage - operating data pipelines
  • An advantage - familiarity with Apache Kafka
Job Responsibility
  • Work in high scale environments - Coralogix data pipeline processes 55Tb of data each day
  • Adopt cutting edge technologies with end-to-end responsibility
  • Build internal tools to expand our platform capabilities
  • Collaborate with R&D to improve stability & reliability of the system
  • Lead the product roadmap - our product is designed for engineers. Therefore, our engineers promote, enhance, and take a crucial part in influencing the product roadmap
  • Perform operational duties for FedRAMP cloud products, including deployments, on-call support, and incident management
What we offer
  • Healthcare
  • Dental
  • Mental health benefits
  • 401(k) plan and match
  • Paid sick time
  • Paid time off
Employment type: Full-time