As an Observability Engineer, you will be part of a team responsible for managing and operating our observability stack, ensuring end-to-end (E2E) monitoring, metrics collection, logging, and tracing across our infrastructure and applications. You will collaborate with other stakeholders to improve system visibility, detect issues proactively, and drive performance optimization. This role requires strong technical expertise and hands-on experience with application analysis and with infrastructure as code (IaC) tools such as Terraform.
Job Responsibilities:
Observability Stack Design & Deployment:
Design and Implement: build a robust observability stack encompassing logging, metrics collection, monitoring, alerting, and tracing systems tailored for cloud environments
Integration: seamlessly connect observability tools with cloud services and infrastructure to achieve comprehensive monitoring and visibility
IaC Development: use Terraform to automate the provisioning and deployment of observability tools and infrastructure, ensuring consistency and efficiency
Monitoring and Optimization:
Monitoring Standards: define and enforce organization-wide monitoring and alerting standards for real-time incident detection
Optimization: continuously refine the observability stack to enhance system performance, minimize downtime, and optimize resource utilization
End-to-End Monitoring Practices:
Comprehensive Tracking: implement end-to-end monitoring solutions that provide insights into the performance, availability, and reliability of IT workloads
Standardization: establish best practices for metrics, logs, and traces, ensuring holistic visibility across the technology stack
Automated Alerts: develop automated alerting systems for proactive issue identification and resolution
Technical Collaboration:
Cross-Team Integration: work closely with DevOps, SRE, and application development teams to align observability strategies with operational objectives
Bachelor’s degree in Computer Science, Engineering, or a related field; equivalent experience may be considered
5+ years of experience in observability for on-prem and cloud infrastructure, or in related fields, including at least 2 years in a leadership or tech lead role
Proven experience in infrastructure troubleshooting and with infrastructure as code (IaC) tooling such as Terraform, Python, or similar
Cloud Platforms: knowledge of AWS, Azure, and GCP observability features; expertise in Oracle Cloud Infrastructure (OCI) is a plus
Observability Tools: skilled in industry-standard tools such as Prometheus, Alertmanager, Grafana, Loki, PagerDuty, and similar
Best Practices: deep understanding of monitoring, logging, and tracing within cloud-native environments
Containerized Platforms: proficient in OpenShift, Kubernetes, and related container platforms
CI/CD & Automation: experienced with CI/CD pipelines and automation tools like Jenkins and GitLab/GitHub
Good problem-solving skills with a focus on delivering high-quality, scalable solutions
Effective communication skills, both written and verbal, with the ability to convey complex technical concepts to technical and non-technical stakeholders
Ability to work in a fast-paced, collaborative environment with changing priorities
What we offer:
Annual bonus
Flexible working
Instant recognition scheme
Access to Udemy for professional and personal learning