We are looking for an experienced Data Engineer to join our data platform team and support the migration of a legacy Data Lake architecture to a modern Lakehouse architecture. The role involves designing and building scalable data pipelines using Apache Spark, Spark Refiner, and AWS cloud-native data engineering tools, and integrating them with Snowflake for advanced analytics and data warehousing. The ideal candidate will have strong experience in distributed data processing, cloud data platforms, and large-scale data migration projects.
Job Responsibilities:
Design and implement data pipelines to migrate data from the existing Data Lake to the Lakehouse architecture
Develop and optimize Spark-based ETL/ELT pipelines using Spark Refiner or similar transformation frameworks
Build scalable data processing workflows using AWS services such as S3, Glue, EMR, Lambda, and Step Functions
Integrate and manage data ingestion into Snowflake for analytics and downstream consumption
Perform data modelling for the Lakehouse architecture (Bronze, Silver, Gold layers; see the PySpark sketch after this list)
Ensure data quality, governance, and lineage across the migration process
Optimize performance of Spark jobs and Snowflake queries for large-scale datasets
Work closely with data architects, analytics teams, and business stakeholders to ensure reliable data delivery
Implement CI/CD pipelines for data engineering workflows
Support data validation, reconciliation, and testing during migration
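For illustration, here is a minimal PySpark sketch of the Bronze-to-Silver step referenced above. The S3 paths, column names, and Delta output format are assumptions for the example, not details from this posting, and Spark Refiner specifics are omitted since they are not described here.

```python
# Minimal Bronze -> Silver sketch in PySpark. Paths, column names, and
# the Delta output format are illustrative assumptions.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("bronze-to-silver").getOrCreate()

# Bronze: raw files landed as-is from the legacy Data Lake.
bronze = spark.read.parquet("s3://example-lake/bronze/events/")

# Silver: deduplicated, typed, validated records ready for modelling.
silver = (
    bronze
    .dropDuplicates(["event_id"])                       # assumed business key
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .withColumn("event_date", F.to_date("event_ts"))
    .filter(F.col("event_id").isNotNull())
)

# Write to the Silver layer; Iceberg or Hudi would be configured similarly.
(
    silver.write
    .format("delta")
    .mode("overwrite")
    .partitionBy("event_date")
    .save("s3://example-lake/silver/events/")
)
```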
Requirements:
Strong experience with Apache Spark (PySpark / Scala)
Experience with Spark Refiner or similar transformation frameworks
Hands-on expertise with the AWS data ecosystem, including S3, Glue, EMR, Lambda, Step Functions, and IAM
Experience building large-scale ETL/ELT pipelines
Strong knowledge of Snowflake, including Snowpipe, data loading, and performance optimization (a loading sketch follows this list)
Excellent SQL and data modelling skills
Understanding of Data Lake and Lakehouse architectures and of modern data storage formats
Knowledge of data migration strategies and validation techniques
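As a reference point for the Snowflake requirement, the sketch below bulk-loads Silver-layer Parquet files from an external stage with COPY INTO, using snowflake-connector-python; continuous ingestion via Snowpipe would instead attach a pipe to S3 event notifications. The account, stage, and table names are hypothetical.

```python
# Minimal sketch: bulk-load Silver-layer Parquet files into Snowflake
# with COPY INTO. Connection details, the @silver_stage external stage,
# and the events table are hypothetical.
import snowflake.connector

conn = snowflake.connector.connect(
    account="example_account",
    user="example_user",
    password="example_password",   # prefer key-pair auth or a secrets manager
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="SILVER",
)
try:
    conn.cursor().execute("""
        COPY INTO events
        FROM @silver_stage/events/
        FILE_FORMAT = (TYPE = PARQUET)
        MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
    """)
finally:
    conn.close()
```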
Nice to have:
Experience with Delta Lake, Iceberg, or Hudi
Familiarity with Airflow or other workflow orchestration tools (a minimal DAG sketch follows this list)
Knowledge of DevOps practices and CI/CD tools
Experience with data governance and catalog tooling
Exposure to streaming platforms such as Kafka or Kinesis
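Since Airflow is listed above, here is a minimal DAG sketch chaining the two steps shown earlier. The task bodies, IDs, and daily schedule are assumptions for illustration.

```python
# Minimal Airflow 2.x DAG sketch orchestrating the transform and load
# steps above; task bodies, IDs, and the daily schedule are assumptions.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def run_bronze_to_silver(**_):
    ...  # submit the Spark job, e.g. to EMR or Glue

def load_silver_to_snowflake(**_):
    ...  # run the COPY INTO statement shown earlier

with DAG(
    dag_id="lakehouse_migration",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",     # Airflow 2.4+; older versions use schedule_interval
    catchup=False,
):
    transform = PythonOperator(
        task_id="bronze_to_silver",
        python_callable=run_bronze_to_silver,
    )
    load = PythonOperator(
        task_id="silver_to_snowflake",
        python_callable=load_silver_to_snowflake,
    )
    transform >> load
```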