We’re looking for a Data Reliability Engineer to help keep our trading and data platforms running seamlessly. In this role, you’ll be the guardian of our data pipelines, ensuring that trading-critical Airflow workflows and Python-based jobs run smoothly, on time, and with precision. You’ll dive deep into incidents when they happen, diagnose issues quickly, and make the fixes that keep downstream systems healthy and reliable. Your work will directly support the speed and stability our trading teams depend on every day.
Job Responsibilities:
Ensure Platform Reliability - Monitor and maintain trading-critical Airflow DAGs and Python-based pipelines, ensuring jobs run on time and within SLAs; validate downstream impacts and maintain tested rollback/recovery procedures
Change & Release Management - Act as a release gatekeeper: review code/config changes, enforce safe deployment standards, and coordinate risk-aware releases via Git(lab) and Octopus Deploy
Collaboration & Communication - Partner with quants and engineers to assess change impacts, document runbooks, and communicate operational updates and risks
Continuous Improvement - Enhance monitoring, alerting, and automation; track KPIs and drive initiatives that strengthen platform resilience and reduce incident recurrence
Requirements:
Degree in a technical or business discipline, or 1+ years of equivalent industry experience
Demonstrated experience with Python or an equivalent language
Excellent analytical & troubleshooting skills, self-motivated and curious
Willing to work alternating shifts to cover early and late responsibilities
Experience with change management and incident management procedures
Experience with technical documentation and support cases
Nice to have:
Financial trading floor experience is a plus but not essential; training will be provided
Knowledge of system monitoring tools such as Check_MK, Splunk, or ELK
Awareness of modern distributed application systems and the glue that binds them: messaging and database systems