We’re hiring a Site Reliability Engineer to join the Infrastructure team. Reporting to our SRE Manager, you will be a member of Ledger’s SRE team, driving technology transformation by launching new platforms, building tools, automating away complex issues, and integrating with the latest technology. Site Reliability Engineers leverage their experience as software and systems engineers to ensure that applications integrated by SRE are available, have full-stack observability, and improve continuously through code and automation.
Job Responsibility:
Participate in building a DevOps / SRE culture and enable the transition to modern infrastructure management and deployment practices
Participate in building the SRE team roadmap (vision and delivery accountability) while anticipating stakeholder needs and the emergence of game-changing technologies, and challenging scope and deadlines
You will bring a strong mixture of software engineering, operations, and systems engineering experience to the role, and you have experience in the integration of complex systems
Perform integration of platform software components
Participate in designing and delivering solutions to improve the availability, scalability, latency, and efficiency of systems
Influence and create standards & best practices in support of service level objectives
Automate key SRE metrics including SLOs/SLAs and error budgets
Provide expert support to our level-2/application support team, to troubleshoot priority incidents, and conduct post-mortems
Apply analytics on past incidents and usage patterns to predict issues and take proactive actions
Ensure control of technical debt and promote quality practices
Apply SRE and chaos engineering approaches across all strategic systems, in coordination with Service Design, to predict and prevent outages and improve solution availability
Design and conduct performance tests, and identify bottlenecks and opportunities for optimization
Requirements:
5+ years of cloud engineering at scale, in organizations operating SaaS solutions
Proficiency in working with Unix/Linux environments, Python, Terraform, Kubernetes, AWS cloud solutions and architectures, CI/CD tools, ArgoCD, Ansible, configuration management, database management (PostgreSQL), API management, etc.
Strong knowledge of observability practices, with experience implementing and managing logging, monitoring and alerting frameworks with solutions such as Datadog or Prometheus/Grafana
Experience of cross-functional work and a demonstrated collaborative approach to building key relationships across the organization and defining project scope, goals, plans and deliverables
“Customer focused”, with the ability to identify and understand the needs of both internal and external customers
Creative problem-solving and analysis skills, with an ability to identify, develop and implement solutions to meet the needs of the business
Excellent presentation and written communication skills
Ability to deal with ambiguity, high levels of pressure and rapidly changing environments
What we offer:
Flexible work options - Our hybrid policy allows employees to work from home up to 3 days per week.
Health & Wellness support - Health and Life Insurance.
Financial growth opportunities - Employees can become shareholders in Ledger and may receive other financial benefits, depending on their country of work.
Commuter allowance - Ledger offers a commuter allowance to contribute to your preferred means of transportation.
Learning & Development - A comprehensive suite of training solutions providing a personalised learning experience for every employee.