We are seeking a Data Engineer to support and enhance a Databricks‑based data platform during its development phase. The role focuses on building reliable, scalable data solutions early in the lifecycle, not on production firefighting. The ideal candidate brings hands‑on experience with Databricks, PySpark, and Python, along with a working understanding of Azure cloud services. You will partner closely with Data Engineering teams to ensure pipelines, notebooks, and workflows are designed for long‑term scalability and production readiness.
Job Responsibilities:
Develop and enhance Databricks notebooks, jobs, and workflows
Write and optimize PySpark and Python code for distributed data processing
Assist in designing scalable and reliable data pipelines
Apply Spark performance best practices: partitioning, caching, joins, file sizing (illustrated in the first sketch after this list)
Work with Delta Lake tables, schemas, and data models
Perform data validation and quality checks during development cycles (both shown in the second sketch after this list)
Support cluster configuration, sizing, and tuning for development workloads
Identify performance bottlenecks early and recommend improvements
Partner with Data Engineers to prepare solutions for future production rollout
Document development standards, patterns, and best practices
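To give candidates a flavor of the day‑to‑day work, here is a minimal PySpark sketch of the performance practices named above (partitioning, caching, broadcast joins, file sizing). All paths, table names, and column names are hypothetical placeholders, and cluster access to Delta data is assumed to be configured.

# Minimal sketch of Spark performance practices; names are placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("perf-practices").getOrCreate()

events = spark.read.format("delta").load("/mnt/dev/events")        # hypothetical path
dims = spark.read.format("delta").load("/mnt/dev/dim_customer")    # hypothetical path

# Partitioning: repartition by the join key to reduce skewed shuffles
events = events.repartition(200, "customer_id")

# Caching: persist a frequently reused intermediate result
active = events.filter(F.col("status") == "active").cache()

# Joins: broadcast the small dimension table to avoid a full shuffle
joined = active.join(F.broadcast(dims), "customer_id")

# File sizing: cap records per output file to avoid tiny-file sprawl
(joined.write.format("delta")
       .option("maxRecordsPerFile", 1_000_000)
       .mode("overwrite")
       .save("/mnt/dev/events_enriched"))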
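A companion sketch covers the Delta Lake and validation responsibilities: it defines a Delta table with an explicit schema, then runs two simple development‑cycle quality checks. The database, table, and column names are illustrative, not part of any actual platform.

# Sketch of Delta Lake schema work plus basic data-quality checks.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Define a Delta table with an explicit schema (names are illustrative)
spark.sql("CREATE DATABASE IF NOT EXISTS dev")
spark.sql("""
    CREATE TABLE IF NOT EXISTS dev.orders (
        order_id    BIGINT,
        customer_id BIGINT,
        amount      DECIMAL(10, 2),
        order_ts    TIMESTAMP
    ) USING DELTA
""")

df = spark.table("dev.orders")

# Development-cycle quality checks: null keys and duplicate IDs
null_keys = df.filter("order_id IS NULL").count()
dupes = df.count() - df.dropDuplicates(["order_id"]).count()
assert null_keys == 0, f"{null_keys} rows with null order_id"
assert dupes == 0, f"{dupes} duplicate order_id values"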
Requirements:
3+ years of strong, hands‑on experience with Databricks and Spark
Proficiency with PySpark and Spark fundamentals
Experience with Delta Lake
Understanding of Spark execution (jobs, stages, tasks, shuffles; see the first sketch after this list)
Strong Python development skills
Ability to write clean, modular, reusable code
Experience working in Microsoft Azure
Familiarity with Azure Databricks, Azure Data Lake Storage (ADLS), and Azure Blob Storage (see the ADLS sketch after this list)
Basic awareness of cloud resource usage and costs
Experience using Git and version control workflows
Familiarity with Databricks Repos or similar tools
Ability to perform testing and validation of data pipelines
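For candidates gauging the expected depth on Spark execution: shuffles appear as Exchange operators in a query plan, as in this small sketch (the data and column names are placeholders).

# Sketch for inspecting Spark's execution plan; Exchange nodes mark
# shuffle boundaries between stages.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()
df = spark.range(1_000_000).withColumn("bucket", F.col("id") % 10)

# groupBy triggers a shuffle: one stage reads, another aggregates
agg = df.groupBy("bucket").agg(F.count("*").alias("n"))
agg.explain()  # look for Exchange (shuffle) operators in the plan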
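Likewise, working with ADLS from Databricks typically means reading abfss:// URIs, along the lines of this sketch; the storage account, container, and path are hypothetical, and credentials are assumed to be configured on the cluster.

# Sketch of reading from ADLS Gen2 via an abfss:// URI (names are hypothetical).
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

path = "abfss://raw@mystorageaccount.dfs.core.windows.net/events/"
df = spark.read.format("parquet").load(path)
df.show(5)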