We are looking for an Engineering Manager to join the OREO (Observability Reliability Engineering Obsession) team in Platform Engineering. As an Engineering Manager, your mission will be to lead the Reliability & Observability team and drive the evolution of Doctolib's observability platform, supporting the exponential growth of Doctolib services while building and empowering a world-class SRE team.

Working in the tech team at Doctolib involves building innovative products and features to improve the daily lives of care teams and patients. We work in feature teams in an agile environment, while collaborating with product, design, and business teams.

You will lead a team of Site Reliability Engineers who are responsible for shaping Doctolib's observability strategy and ensuring our platform remains reliable, debuggable, and scalable. This role sits at the intersection of people management, technical leadership, and strategic planning with a particular focus on building organizational capabilities around logging, metrics, tracing, and alerting. Your team also owns and operates critical transversal services that enable secure, scalable infrastructure management across the organization, including HashiCorp Vault for secrets management and Terraform Enterprise for infrastructure as code.
Job Responsibilities:
Lead, coach, and grow a team of Site Reliability Engineers, supporting their technical development and career progression
Create a culture of operational excellence, continuous improvement, and psychological safety within the team
Conduct regular 1:1s, performance reviews, and career development conversations
Recruit, onboard, and retain top SRE talent aligned with Doctolib's mission and values
Partner with SREs and senior engineers to define and evolve the observability strategy across the platform, focusing on logging, metrics, tracing, and alerting
Own the strategy and evolution of critical transversal services including HashiCorp Vault and Terraform Enterprise
Drive prioritization and roadmap planning for large-scale reliability and observability initiatives
Ensure alignment between team objectives and broader engineering and business goals
Advocate for and allocate resources toward reducing technical debt and improving developer experience
Own the team's on-call experience and contribute to the incident response processes, ensuring sustainable practices and continuous improvement
Ensure high availability and reliability of transversal services that are critical to the entire engineering organization
Lead postmortem reviews and drive systemic improvements to prevent recurring issues
Work closely with Product Managers, Engineering Managers, and architects to align observability capabilities with product and platform needs
Partner with security and infrastructure teams to evolve secrets management and IaC practices across the organization
Represent the OREO team in engineering leadership forums, architectural reviews, and strategic planning sessions
Foster strong partnerships with software engineering teams to improve instrumentation quality and adoption of observability best practices
Requirements:
At least 5 years of software engineering or SRE experience, with a strong technical background in cloud-native environments (preferably AWS, GCP, and/or Kubernetes-based)
3+ years of engineering management experience, leading technical teams (ideally SRE, platform, or infrastructure teams)
Deep understanding of observability tooling and architecture (Fluent Bit, OpenTelemetry, Loki, Elasticsearch, Prometheus, Thanos, Datadog)
Experience with infrastructure as code (Terraform, OpenTofu) and secrets management systems (Vault, AWS Secrets Manager)
Proven ability to balance technical depth with people leadership: mentoring engineers, reviewing technical designs, and guiding architectural decisions
Nice to have:
Experience scaling SRE or platform teams in fast-growing, high-traffic environments
Background in designing and operating high-scale telemetry pipelines
Hands-on experience with HashiCorp Vault and Terraform Enterprise in production environments
Hands-on experience with backend programming languages (e.g., Go, Python, Ruby)
Experience driving cultural and technical transformations
What we offer:
Free comprehensive health insurance for you and your children
Parent Care Program: receive one additional month of leave on top of the legal parental leave
Free mental health and coaching services through our partner Moka.care
For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support
Work from EU countries and the UK for up to 10 days per year, thanks to our flexibility days policy
A subsidy from the Work Council to refund part of a sports club membership or creative class
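Up to 14 RTT days per year (RTT, "réduction du temps de travail", is additional time off under French working-time rules)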