Main Purpose: Partner with data scientists and business stakeholders to design, build, and maintain efficient data pipelines feeding the organization's data lake; safeguard the lake's integrity and quality so it yields accurate, actionable insights; and apply data engineering and cloud expertise to strengthen the data infrastructure, including ownership of an extensive data catalog that underpins data governance and effective data access across the organization.
Job Responsibility:
Collaborate with data scientists and business stakeholders to design, develop, and maintain efficient data pipelines feeding into the organization's data lake
Maintain the integrity and quality of the data lake, enabling accurate and actionable insights for data scientists and informed decision-making for business stakeholders
Utilize extensive knowledge of data engineering and cloud technologies to enhance the organization’s data infrastructure, promoting a culture of data-driven decision-making
Apply data engineering expertise to define and optimize data pipelines using advanced concepts to improve the efficiency and accessibility of data storage
Own the development of an extensive data catalog, ensuring robust data governance and facilitating effective data access and utilization across the organization (see the catalog-and-governance sketch after this list)
Collaborate with stakeholders (data scientists, analysts, product teams) to translate business requirements into Databricks-native data solutions
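To make the catalog-and-governance responsibility concrete, here is a minimal sketch, assuming a Databricks workspace with Unity Catalog enabled and a notebook context where `spark` is predefined; the catalog, schema, table, and group names are hypothetical placeholders, not part of the posting.

```python
# Minimal catalog-and-governance sketch (assumes Unity Catalog).
# All names ("analytics", "sales", "data_scientists") are hypothetical.

# Organize the catalog hierarchy.
spark.sql("CREATE CATALOG IF NOT EXISTS analytics")
spark.sql("CREATE SCHEMA IF NOT EXISTS analytics.sales")

# Document tables so they are discoverable in the data catalog.
spark.sql("""
    COMMENT ON TABLE analytics.sales.orders IS
    'Curated orders fed by the ingestion pipelines; refreshed hourly.'
""")

# Govern access: read-only grants for the data science group.
spark.sql("GRANT USE CATALOG ON CATALOG analytics TO `data_scientists`")
spark.sql("GRANT USE SCHEMA, SELECT ON SCHEMA analytics.sales TO `data_scientists`")
```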
Requirements:
Contribute to the development of scalable and performant data pipelines on Databricks, leveraging Delta Lake, Delta Live Tables (DLT), and other core Databricks components (see the DLT sketch after this list)
Develop data lakes/warehouses designed for optimized storage, querying, and real-time updates using Delta Lake (see the Delta optimization sketch after this list)
Implement effective data ingestion strategies from various sources (streaming, batch, API-based), ensuring seamless integration with Databricks (see the ingestion sketch after this list)
Ensure the integrity, security, quality, and governance of data across our Databricks-centric platforms
Build and maintain ETL/ELT processes, heavily utilizing Databricks, Spark (Scala or Python), SQL, and Delta Lake for transformations (see the ELT sketch after this list)
Apply CI/CD and DevOps practices tailored to the Databricks environment (see the CI/CD sketch after this list)
Monitor the cost-efficiency of data operations on Databricks and optimize resource utilization
Utilize a range of Databricks tools, including the Databricks CLI and REST API, alongside Apache Spark™, to develop, manage, and optimize data engineering solutions (the CI/CD sketch below also exercises the REST API)
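As an illustration of the pipeline work described above, here is a minimal Delta Live Tables sketch, assuming it runs inside a Databricks DLT pipeline (where the `dlt` module and `spark` are available); the source path and column names are hypothetical.

```python
import dlt
from pyspark.sql import functions as F

# Bronze: incremental file ingestion with Auto Loader.
@dlt.table(comment="Raw events ingested incrementally with Auto Loader.")
def bronze_events():
    return (
        spark.readStream.format("cloudFiles")      # Auto Loader
        .option("cloudFiles.format", "json")
        .load("/Volumes/raw/events/")              # hypothetical path
    )

# Silver: the expectation drops rows violating the rule instead of failing.
@dlt.table(comment="Validated events for downstream consumers.")
@dlt.expect_or_drop("valid_event_id", "event_id IS NOT NULL")
def silver_events():
    return (
        dlt.read_stream("bronze_events")
        .withColumn("ingested_at", F.current_timestamp())
    )
```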
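For the Delta-optimized storage and real-time update item, a minimal sketch follows, assuming a Databricks runtime with `spark` predefined; table and column names are hypothetical.

```python
from delta.tables import DeltaTable

# Compact small files and co-locate rows for faster selective queries.
spark.sql("OPTIMIZE analytics.sales.orders ZORDER BY (customer_id)")

# Apply near-real-time updates with MERGE instead of full-table rewrites.
updates = spark.read.table("staging.orders_updates")   # hypothetical source
target = DeltaTable.forName(spark, "analytics.sales.orders")
(
    target.alias("t")
    .merge(updates.alias("s"), "t.order_id = s.order_id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute()
)
```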
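The ingestion item spans streaming and batch sources; the sketch below shows one of each, assuming a Databricks runtime (where `spark` and `dbutils` are predefined). The broker, topic, JDBC URL, secret scope, and table names are hypothetical.

```python
# Streaming ingestion: consume a Kafka topic continuously into a Delta table.
stream_query = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")     # hypothetical broker
    .option("subscribe", "orders")                        # hypothetical topic
    .load()
    .selectExpr("CAST(value AS STRING) AS payload", "timestamp")
    .writeStream.format("delta")
    .option("checkpointLocation", "/checkpoints/orders")  # hypothetical path
    .toTable("bronze.orders_stream")
)

# Batch ingestion: periodic pull from a relational source over JDBC.
batch_df = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db:5432/shop")      # hypothetical URL
    .option("dbtable", "public.customers")
    .option("user", "reader")
    .option("password", dbutils.secrets.get("etl", "db-password"))
    .load()
)
batch_df.write.format("delta").mode("overwrite").saveAsTable("bronze.customers")
```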
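For the ETL/ELT item, here is a minimal sketch mixing the DataFrame API with SQL and landing the result in Delta, assuming a Databricks runtime; all table and column names are hypothetical.

```python
from pyspark.sql import functions as F

orders = spark.read.table("bronze.orders")         # hypothetical inputs
customers = spark.read.table("bronze.customers")

# Transform with the DataFrame API...
enriched = (
    orders.join(customers, "customer_id", "left")
    .withColumn("order_date", F.to_date("order_ts"))
)
enriched.createOrReplaceTempView("enriched_orders")

# ...then finish the aggregation in SQL, writing a Delta table.
spark.sql("""
    CREATE OR REPLACE TABLE gold.daily_revenue AS
    SELECT order_date, SUM(amount) AS revenue
    FROM enriched_orders
    GROUP BY order_date
""")
```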
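Finally, for the CI/CD and REST API items: deployment itself is typically handled by the Databricks CLI (for example, `databricks bundle deploy` with Databricks Asset Bundles), and a pipeline step can then trigger and verify a job run through the Jobs REST API, as in the sketch below. The workspace host, token variable, and job id are hypothetical; the 2.1 Jobs endpoints are the documented ones.

```python
import os
import requests

host = "https://<workspace>.cloud.databricks.com"  # hypothetical workspace URL
headers = {"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"}

# Kick off the deployed job, e.g. a smoke test of the new pipeline code.
run = requests.post(
    f"{host}/api/2.1/jobs/run-now",
    headers=headers,
    json={"job_id": 123},                          # hypothetical job id
).json()

# Check the run state so the CI step can fail if the job fails.
status = requests.get(
    f"{host}/api/2.1/jobs/runs/get",
    headers=headers,
    params={"run_id": run["run_id"]},
).json()
print(status["state"]["life_cycle_state"])
```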