We are looking for an experienced Senior Data Engineer to design, build, and optimize modern, cloud-based data platforms that power analytics, AI, and data products across the organization. You will work on scalable batch, streaming, and near-real-time pipelines, enabling high-quality, curated datasets while ensuring robust data governance, security, and observability across the data ecosystem. You will also play a key role in supporting AI and GenAI systems, enabling pipelines for machine learning, causal modeling, and LLM-powered applications such as RAG and agent-based systems.
Job Responsibilities
Design and implement scalable data platforms and pipelines across cloud environments using technologies such as Spark and Delta Lake
Build ingestion, transformation, and curation workflows for both structured and unstructured data
Implement modern data architectures including lakehouse patterns and medallion layering
Deliver high-quality datasets that support analytics, machine learning, causal modeling, and optimization systems
Enable data pipelines for GenAI use cases including LLMs, RAG pipelines, and vector-based data flows
Design scalable logical and physical data models for analytical and operational use cases
Orchestrate workflows using tools such as Airflow, dbt, Lakeflow, or equivalents
Apply modern architecture patterns including event-driven and streaming architectures
Ensure adherence to best practices in data governance, lineage, quality, and access control
Establish strong data observability
Enable data serving layers to support downstream systems
Continuously monitor and optimize pipelines and infrastructure for performance, scalability, and cost efficiency
Work closely with data scientists, ML engineers, analysts, and business stakeholders
Requirements
Strong hands-on experience with Apache Spark and Delta Lake
Strong programming skills in Python and SQL
Proven experience building batch and streaming data pipelines and production-grade data platforms
Solid understanding of data modeling, data quality, and governance principles
Experience with one or more major cloud platforms (Microsoft Azure / Fabric preferred; AWS or GCP also welcome)
Familiarity with modern data platforms such as Databricks and Snowflake
Experience with lakehouse architectures and distributed data systems
Strong understanding of scalability, reliability, and performance considerations in data pipelines
Strong problem-solving skills focused on scalability and reliability
Collaborative approach to working in cross-functional teams
Experience in Agile or consulting environments is beneficial
Nice to have
Experience with GenAI and AI data systems (e.g., RAG pipelines, vector databases, LLM data preparation)
Experience with CI/CD for data pipelines and with infrastructure-as-code tools such as Terraform, ARM templates, or CloudFormation
Additional exposure to streaming technologies (e.g., Kafka)
Spark optimization
Advanced analytics and ML workloads (including causal or experimentation platforms)
Experience building data products or large-scale analytics platforms
What we offer
Flexibility with hybrid work options and 25 vacation days
Co-subsidized transportation & Multisport cards
Premium health insurance
Training policy covering technical and other skills-related events, courses, and certifications
Personal career development roadmap guided by performance evaluations
Self-care program offering psychological consultations & discussions