Incident Engineer

Adyen

Location:
India, Bengaluru

Contract Type:
Not provided

Salary:

Not provided

Job Description:

This team, part of Global Platform Operations under the Monitoring Engineering pillar, exhibits unwavering attention to detail and a deep understanding of the platform-wide monitoring implications for all merchants. In this role, you will be on call monitoring platform performance, communicating with merchants, working on monitoring frameworks, and providing feedback to product engineering teams to improve the reliability of the platform. You will initiate and lead initiatives across our platform offerings, prioritizing merchant impact, to proactively detect issues and inform merchants quickly.

Job Responsibility:

  • Participate in 24/7 on-call monitoring
  • Observe platform and merchant performance and detect any issues proactively to mitigate risks in partnership with Engineering teams
  • Be an expert in communicating with merchants in real time during an incident, presenting the most accurate and up-to-date information to keep them informed
  • Work together with Operations, Product, Engineering, and reliability teams to integrate, grow, and continuously improve our monitoring strategy and increase our reliability
  • Improve operations by leading and project-managing initiatives and/or developing tools and automation for effective monitoring
  • Investigate alerts and provide feedback to engineering teams to build effective logging and alerts across the platform architecture
  • Mitigate merchant impact risk by acting on alerts in partnership with Engineering teams, and contribute to the monitoring playbook by documenting your learnings
  • Focus on ruthlessly prioritizing, automating, and scaling every aspect of our detection capabilities

Requirements:

  • You have 5 to 10 years of experience with client communication during incidents and platform monitoring operations
  • You're willing to participate in the on-call rotation and work in a fast-paced, dynamic environment
  • You have experience with monitoring and logging tools like Prometheus, Grafana, ELK Stack, etc.
  • You have experience with observability platforms like Datadog, Dynatrace, Splunk
  • You have excellent analytical and problem-solving skills, with the ability to analyze complex systems and spot the root cause of issues
  • You thrive in an environment where collaboration is crucial and where a global approach is key to the successful implementation of processes and projects
  • You have a passion for defining and standardizing processes to drive strategic improvement, and you are able to translate complex technical concepts with ease for non-technical audiences
  • You have a natural ability for handling complex situations and multiple responsibilities simultaneously
  • You're a strong team player and thrive in a dynamic environment

Additional Information:

Job Posted:
May 03, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Incident Engineer

Manager, Site Reliability Engineering and Incident Management

Planet DDS is seeking a Manager, Site Reliability Engineering and Incident Manag...
Location:
United States, Atlanta
Salary:
118000.00 - 160000.00 USD / Year
Planet DDS
Expiration Date
Until further notice
Requirements
  • 7+ years in SRE, DevOps, or Infrastructure roles
  • 3+ years in Incident Management leadership
  • Deep understanding of reliability, scalability, and performance optimization
  • Multi-cloud expertise in AWS, Azure, or GCP
  • Understanding of DNS, load balancing, firewalls, and compliance frameworks
  • Knowledge of fundamental cloud security (e.g., identity and access management, firewalls)
  • Deep understanding of logging, monitoring, and security best practices
  • Strong collaboration and communication skills
  • Bachelor’s Degree in a relevant major or equivalent years of experience is a plus
Job Responsibility
  • Lead and mentor a team of SREs and Incident Managers
  • Foster a culture of reliability, accountability, and continuous improvement
  • Collaborate with engineering teams to design resilient platform architectures
  • Oversee the incident response process for outages and service disruptions
  • Ensure timely detection, escalation, and resolution of incidents
  • Drive post-incident reviews (PIRs) and root cause analysis
  • Implement improvements based on lessons learned to prevent recurrence
  • Mature and enforce best practices for incident response and runbooks
  • Automate operational tasks to reduce toil and improve efficiency
  • Maintain observability tools (monitoring, alerting, logging)
Employment Type:
Full-time

Incident Response Security Engineer

The Security Team is responsible for providing key security capabilities coverin...
Location:
Canada
Salary:
Not provided
ClickHouse
Expiration Date
Until further notice
Requirements
  • Background in product security / red teaming / penetration testing / threat modeling, combined with incident detection and response experience
  • Strong knowledge of and experience with one or more cloud service providers (e.g. AWS, GCP, Azure)
  • Excellent written and verbal communication skills
  • Experience securing large-scale customer-facing cloud infrastructures
  • Significant development and automation experience, with a preference for Golang and Python
Job Responsibility
  • Develop processes, tooling and automation to scale incident management response and mitigate risks to the business
  • Collaborate with other security functions, engineering, product, support, and business operations to identify appropriate detection use cases and automation
  • Apply a threat-modeling-centric approach to incident detection and response
  • Maintain the security logging platform
  • Stay up to date with the latest threats and attack vectors to improve our detection mechanisms and attack surface management
  • Handle information security events and incidents across the ClickHouse products and services
What we offer
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites

Incident Response Security Engineer

The Security Team is responsible for providing key security capabilities coverin...
Location:
United States
Salary:
169150.00 - 225000.00 USD / Year
ClickHouse
Expiration Date
Until further notice
Requirements
  • Background in product security / red teaming / penetration testing / threat modeling, combined with incident detection and response experience
  • Strong knowledge of and experience with one or more cloud service providers (e.g. AWS, GCP, Azure)
  • Excellent written and verbal communication skills
  • Experience securing large-scale customer-facing cloud infrastructures
  • Significant development and automation experience, with a preference for Golang and Python
Job Responsibility
  • Develop processes, tooling and automation to scale incident management response and mitigate risks to the business
  • Collaborate with other security functions, engineering, product, support, and business operations to identify appropriate detection use cases and automation
  • Apply a threat-modeling-centric approach to incident detection and response
  • Maintain the security logging platform
  • Stay up to date with the latest threats and attack vectors to improve our detection mechanisms and attack surface management
  • Handle information security events and incidents across the ClickHouse products and services
What we offer
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites
Employment Type:
Full-time

Incident Response Security Engineer

The Security Team is responsible for providing key security capabilities coverin...
Location:
Netherlands
Salary:
Not provided
ClickHouse
Expiration Date
Until further notice
Requirements
  • Background in product security / red teaming / penetration testing / threat modeling, combined with incident detection and response experience
  • Strong knowledge of and experience with one or more cloud service providers (e.g. AWS, GCP, Azure)
  • Excellent written and verbal communication skills
  • Experience securing large-scale customer-facing cloud infrastructures
  • Significant development and automation experience, with a preference for Golang and Python
Job Responsibility
  • Develop processes, tooling and automation to scale incident management response and mitigate risks to the business
  • Collaborate with other security functions, engineering, product, support, and business operations to identify appropriate detection use cases and automation
  • Apply a threat-modeling-centric approach to incident detection and response
  • Maintain the security logging platform
  • Stay up to date with the latest threats and attack vectors to improve our detection mechanisms and attack surface management
  • Handle information security events and incidents across the ClickHouse products and services
What we offer
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites

Incident Response Security Engineer

The Security Team is responsible for providing key security capabilities coverin...
Location:
United Kingdom
Salary:
Not provided
ClickHouse
Expiration Date
Until further notice
Requirements
  • Background in product security / red teaming / penetration testing / threat modeling, combined with incident detection and response experience
  • Strong knowledge of and experience with one or more cloud service providers (e.g. AWS, GCP, Azure)
  • Excellent written and verbal communication skills
  • Experience securing large-scale customer-facing cloud infrastructures
  • Significant development and automation experience, with a preference for Golang and Python
Job Responsibility
  • Develop processes, tooling and automation to scale incident management response and mitigate risks to the business
  • Collaborate with other security functions, engineering, product, support, and business operations to identify appropriate detection use cases and automation
  • Apply a threat-modeling-centric approach to incident detection and response
  • Maintain the security logging platform
  • Stay up to date with the latest threats and attack vectors to improve our detection mechanisms and attack surface management
  • Handle information security events and incidents across the ClickHouse products and services
What we offer
  • Flexible work environment - ClickHouse is a globally distributed company and remote-friendly. We currently operate in 20 countries
  • Healthcare - Employer contributions towards your healthcare
  • Equity in the company - Every new team member who joins our company receives stock options
  • Time off - Flexible time off in the US, generous entitlement in other countries
  • A $500 Home office setup if you’re a remote employee
  • Global Gatherings – We believe in the power of in-person connection and offer opportunities to engage with colleagues at company-wide offsites

Production Support Engineer

The role involves responsibility for incident management, automation in operatio...
Location:
India, Chennai; Pune
Salary:
Not provided
Citi
Expiration Date
Until further notice
Requirements
  • Deep understanding of incident response, recovery processes, and engineering operations in enterprise environments and the related KPIs
  • Champion automation initiatives, including scripting for automated health checks, alerting, and remediation
  • Work closely with development, infrastructure, and business teams, ensuring seamless communication and collaboration
  • Core technical skills in operating systems (Linux – RHEL), databases (Oracle, MongoDB), middleware/application layers (WebSphere, nginx, Tomcat), and message queues (IBM MQ, Kafka)
  • Strong scripting and automation skills (e.g., Python, shell scripting); experience in AI/ML is an added advantage
  • 2-5 years of relevant experience
  • Experience working in Financial Services or a large, complex, and/or global environment
  • Project management experience
  • Comprehensive knowledge of design metrics, analytics tools, benchmarking activities, and related reporting
  • Ability to work under pressure and manage tight deadlines or unexpected changes
Job Responsibility
  • Deep understanding of incident response, recovery processes, and engineering operations in enterprise environments and related KPIs
  • Champion automation initiatives to streamline tasks and reduce manual intervention
  • Work closely with development, infrastructure, and business teams to ensure seamless communication
  • Participate in post-mortem analysis and contribute to continuous improvement efforts
What we offer
  • Resources to meet unique needs
  • Empower healthy decision-making
  • Manage financial well-being
  • Plan for the future
Employment Type:
Full-time

Tech Support Engineer

We are looking for a Tech Support Engineer to be the first point of contact for ...
Location:
South Korea, Seoul
Salary:
Not provided
Fever
Expiration Date
Until further notice
Requirements
  • Experience in technical support or troubleshooting roles
  • Familiarity with monitoring and logging tools (e.g., Datadog, Grafana, Kibana, ...)
  • Knowledge of APIs, services, and client-server architecture
  • Understanding of incident management, triage, and escalation
  • Proficiency in English for clear communication
Job Responsibility
  • Diagnose issues, determine if they are bugs, failures, or misconfigurations, and escalate when necessary
  • Work closely with engineering, product, and QA teams to ensure smooth operations
  • Learn the platform’s features, architecture, and configuration
  • Shadow team members to understand incident handling and prioritization
  • Take ownership of simple incidents with guidance
  • Meet key teams: Engineering, Product, Data and Systems to understand collaboration workflows
  • Handle most incidents independently, using logs and monitoring tools
  • Determine issue causes and escalate complex cases appropriately
  • Improve internal documentation and support processes
  • Assist users with technical issues
What we offer
  • Opportunity to have a real impact in a high-growth global category leader
  • 40% discount on all Fever events and experiences
  • Home office friendly anywhere in South Korea
  • Responsibility from day one and professional and personal growth
  • Great work environment with a young, international team of talented people to work with
  • English Lessons
  • Attractive compensation package consisting of base salary and the potential to earn a significant bonus for top performance (Including Base, Variable, and Stock Options)
Employment Type:
Full-time

Principal Security Engineer

We’re building a world-class global Security team as part of our Trust Program. ...
Location:
India, Hyderabad
Salary:
Not provided
Highspot
Expiration Date
Until further notice
Requirements
  • 10+ years of robust, progressive experience in security engineering, application security, DevSecOps, incident detection and response, or closely related fields
  • Advanced proficiency in at least one programming language (Python, Ruby, Go, Rust, JavaScript), with deep experience conducting detailed code reviews and security assessments across multiple languages
  • Hands-on experience with deploying, operating, and interpreting results from security tools such as static analyzers, web vulnerability scanners, supply chain analysis scanners, and host-based intrusion detection systems
  • Demonstrated experience mentoring, coaching and guiding junior and mid-level security engineers, contributing to a strong team culture, and supporting peer development as a senior individual contributor
  • Demonstrated proactive approach, strong continuous learning orientation, and curiosity about emerging threats, security trends, and innovative technologies
  • Extensive expertise securing cloud-native environments (AWS, Azure, GCP, containers, microservices), with in-depth knowledge of modern cloud security risks and defenses
  • Demonstrated ability to embrace being wrong, practice humility, continuously learn from experiences, and actively seek insights through thoughtful questioning and collaboration
Job Responsibility
  • Lead comprehensive application security assessments, advanced threat modeling sessions, and secure code reviews across critical product features, internal tooling, endpoints, and third-party integrations
  • Collaborate strategically with product engineering to establish and enhance secure-by-default and privacy-by-design practices within the software development lifecycle (SDLC)
  • Lead and otherwise participate in incident detection, investigation, triage, containment, and root cause analysis for high impact security incidents, providing mentorship and guidance to junior engineers as required
  • Drive the development and continuous improvement of sophisticated detection rules, response automation, and optimized alert management across cloud environments, corporate infrastructure, and SaaS platforms
  • Lead and participate in complex vulnerability remediation processes, and effectively respond to security issues discovered by both internal teams and external sources
  • Document technical findings, strategic decisions, and procedural enhancements in a clear and accessible manner
  • Contribute significantly to comprehensive security playbooks and knowledge repositories
  • Manage and oversee asksecurity@ request handling, and actively participate in sprint-based security activities, balancing strategic and tactical execution
  • Actively participate in the security on-call rotation, or provide senior-level guidance as required during an event and aid in rapid response capabilities to protect our 24x7 platform and global workforce
Employment Type:
Full-time