We are seeking an Apigee X Site Reliability Engineer (SRE) with strong production experience operating APIs at scale. This role is focused on ensuring the reliability, performance, and resilience of Apigee X–backed services that support critical customer journeys.
Job Responsibilities:
Define, implement, and maintain Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for Apigee-backed services
Design, implement, and continuously tune alerting strategies across the API platform
Own and enhance operational dashboards covering Golden Signals and dependency health using Datadog
Proactively identify anomalies and performance degradation trends
Produce weekly and monthly reliability reports
Implement and maintain synthetic monitoring and user journey checks for critical API flows
Participate in 24x7 on-call rotations and lead incident response and problem management activities
Requirements:
3-4 years of experience as a reliability or production support professional
Strong hands-on expertise in the Apigee platform, particularly Apigee X
Proficient in custom reporting and advanced debugging within Apigee environments
Experienced with APM and observability tools, including creating dashboards, alerts, and monitors (Datadog preferred)
Comfortable operating in production environments and responding to incidents with a structured, customer-impact-focused approach
Knowledgeable in modern cloud technologies and distributed systems
Familiar with Agile ways of working and collaborative, cross-functional delivery
Bachelor’s degree in Computer Science or Computer Engineering, or equivalent practical experience
What we offer:
Opportunity to work on large-scale, business-critical API platforms
Exposure to advanced reliability engineering practices within a global technology organisation
Collaboration with diverse, cross-functional teams across markets and partners
A role with clear ownership, influence, and measurable outcomes in platform reliability and resilience