The Lead Integration & Observability Specialist will design and implement observability solutions for cloud-based integration platforms on AWS and Azure. The role requires 7+ years of IT experience, with a focus on monitoring, automation, and operational readiness. Candidates should have strong hands-on experience with observability tools and a track record of mentoring engineers. This position offers opportunities for technical leadership and collaboration with cross-functional teams.
Job Responsibilities:
Lead the implementation of enterprise observability for applications, APIs, services, batch jobs, and data pipelines
Design and standardize monitoring, alerting, logging, metrics, and health checks across distributed systems
Integrate observability platforms with incident management and automation tools to support proactive issue detection and remediation
Support reliability and availability of integration platforms built on AWS/Azure
Perform advanced troubleshooting using logs, metrics, and traces to resolve production issues
Define operational readiness standards and non-functional requirements
Mentor engineers on observability best practices and platform usage
Collaborate with product, support, and operations teams to improve service stability and delivery
Requirements:
7+ years of overall IT experience
5+ years of relevant experience in Observability / Monitoring / Reliability Engineering
Strong hands-on experience with enterprise observability tools such as IBM Instana, Dynatrace, AppDynamics, Prometheus, and Grafana
Expertise in monitoring and alerting design, log management and analysis, metrics and distributed tracing, and health checks and SLO/SLI concepts
Experience monitoring AWS/Azure workloads
Strong troubleshooting and incident analysis skills
Experience defining operational and non-functional requirements
Technical leadership and mentoring experience
Experience with automation and ITSM integration (ServiceNow workflows, incident automation)