The SRE Intern will join the Platform Team to discover and contribute to the infrastructure and systems that ensure the reliability, performance, and security of our production environments. Under the mentorship of experienced SREs, the intern will combine learning with hands-on contribution, applying software engineering principles to real infrastructure and operational challenges. The role involves close collaboration with the SRE team, Development teams, and other stakeholders on automation, observability improvements, and infrastructure-as-code practices. The intern will progressively gain autonomy on well-scoped projects while learning incident management, capacity planning, and reliability engineering fundamentals in a production context.
Job Responsibilities:
Participate alongside Development teams in infrastructure discussions, deployment processes, and operational requirements
Contribute to monitoring, alerting, and observability improvements (dashboards, alerts, log hygiene)
Write and review Terraform / Terragrunt modules under supervision, learning Infrastructure-as-Code best practices
Contribute to disaster recovery documentation and backup verification procedures
Shadow and progressively contribute to incident response efforts, learning root cause analysis methodology
Develop and improve runbooks and documentation for operational procedures
Help ensure proper logging and monitoring coverage across systems
Contribute to automation initiatives to reduce manual operations (scripts, tooling, pipeline improvements)
Learn and apply SRE practices (SLOs, error budgets, toil reduction) in day-to-day work
Work with development teams to understand and support operational readiness requirements
Collaborate with the SRE team on infrastructure security measures
Participate in knowledge sharing sessions and team rituals
Document learnings, contribute to the team’s knowledge base, and share findings with peers
Partner with team members to improve developer experience through tooling and documentation
Requirements:
Student in a Computer Science / Engineering program, looking for a 5-to-6-month internship (internship agreement, "convention de stage", required)
Solid fundamentals in systems
Familiarity with, or curiosity about, AWS, Kubernetes, Terraform and Terragrunt, ArgoCD and CircleCI, OpenTelemetry and Datadog, and GNU/Linux systems such as Debian
Comfortable with, or eager to learn: Linux/Unix systems; distributed systems fundamentals and cloud architectures; scripting (Bash, Python, or equivalent) to automate tasks; incident response practices and structured troubleshooting; working in both French and English in a hybrid/remote context
Strong problem-solving skills and a methodical approach to understanding how systems work
Reliability-curious: genuinely interested in how production systems run, how failures happen, and how to build resilient infrastructure
Nice to have:
Experience with our tech stack (Ruby, Elixir, React.js) or contributions to personal/open-source infra projects are a significant advantage