The Data Engineer will play a crucial role in migrating data from an on-prem DataLake to an AWS LakeHouse. This position requires a minimum of 3-5 years of experience in data engineering, with strong skills in Python and SQL. The candidate will engage with stakeholders to ensure data integrity and will be responsible for translating legacy data consumption patterns for compatibility with modern tools. A Bachelor's or Master's degree in a relevant field is required.
Job Responsibilities:
Datastore Migration: perform end-to-end migration from the on-prem DataLake to the AWS LakeHouse
Pipeline Migration: refactor and migrate extraction logic and job scheduling
Data Transfer: execute the physical migration of underlying datasets while ensuring data integrity
Stakeholder Engagement: act as technical liaison to internal clients
Consumption Pattern Migration: translate and optimize legacy SQL- and Spark-based consumption patterns for Snowflake and Iceberg
Usage Analysis: understand usage patterns and deliver the required data products
Data Reconciliation and Quality: work with reconciliation frameworks to ensure data equivalence between source and target
Platform Adoption: work with the internal data management platform and learn new workflows
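To illustrate the kind of work the reconciliation responsibility involves, here is a minimal sketch of checking data equivalence between a source and a migrated copy. The `table_fingerprint` and `reconcile` helpers are hypothetical, not part of any framework named in this posting; production reconciliation frameworks compare far richer metrics (per-column aggregates, schema drift, sampled diffs).

```python
import hashlib

def table_fingerprint(rows):
    """Order-independent fingerprint: XOR of per-row SHA-256 digests.

    Hypothetical helper for illustration only -- real frameworks use
    richer checks than a single whole-table hash.
    """
    fp = 0
    for row in rows:
        digest = hashlib.sha256("|".join(map(str, row)).encode()).hexdigest()
        fp ^= int(digest, 16)
    return fp

def reconcile(source_rows, target_rows):
    """True when row count and fingerprint match between the two stores."""
    return (len(source_rows) == len(target_rows)
            and table_fingerprint(source_rows) == table_fingerprint(target_rows))

# Example: rows from the on-prem store vs. the migrated copy
source = [(1, "alice"), (2, "bob")]
target = [(2, "bob"), (1, "alice")]   # same data, different physical order
print(reconcile(source, target))      # True
```

Because the fingerprint XORs per-row digests, it is insensitive to row order, which matters when a migration changes physical layout or partitioning.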
Requirements:
Bachelor’s or Master’s degree in Computer Science, Applied Mathematics, Engineering, or a related quantitative field
Minimum of 3-5 years of professional hands-on coding experience in a collaborative, team-based environment
Ability to troubleshoot SQL, plus basic scripting experience
Professional proficiency in Python or Java
Deep familiarity with the full Software Development Life Cycle (SDLC), CI/CD best practices, and Kubernetes (K8s) deployments
Sophisticated understanding of temporal data modeling, schema management, performance optimization, and architectural theory
Technologies: Kafka, ANSI SQL, FTP, Apache Spark
Data formats: JSON, Avro, Parquet
Platforms: Hadoop (HDFS/Hive), Snowflake, Apache Iceberg, Sybase IQ