Job Responsibilities:
Develop & Optimize Data Pipelines
Build, test, and maintain ETL/ELT data pipelines using Azure Databricks & Apache Spark (PySpark)
Optimize performance and cost-efficiency of Spark jobs
Ensure data quality through validation, monitoring, and alerting mechanisms
Understand cluster types, configurations, and use cases for serverless compute
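
As a rough illustration of this responsibility, here is a minimal PySpark ETL sketch; the storage path, table name, and validation rule are hypothetical:

```python
# Minimal PySpark ETL sketch (illustrative only; paths and names are hypothetical).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw JSON landed in the lake (hypothetical ADLS path).
raw = spark.read.json("abfss://raw@examplelake.dfs.core.windows.net/orders/")

# Transform: basic cleanup plus a simple data-quality gate.
clean = (
    raw.dropDuplicates(["order_id"])
       .withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .filter(F.col("amount") > 0)  # validation rule: reject non-positive amounts
)

# Load: write a Delta table, partitioned to keep downstream Spark jobs cost-efficient.
(clean.write.format("delta")
      .mode("overwrite")
      .partitionBy("order_date")
      .saveAsTable("bronze.orders"))
```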
Implement Unity Catalog for Data Governance
Design and enforce access control policies using Unity Catalog
Manage data lineage, auditing, and metadata governance
Enable secure data sharing across teams and external stakeholders
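
A minimal sketch of Unity Catalog access control, assumed to run in a Databricks notebook where `spark` is already defined; the catalog, schema, table, and group names are hypothetical:

```python
# Unity Catalog privileges are granted with standard SQL; from a notebook this
# can be driven from Python via spark.sql. Names below are hypothetical.
spark.sql("GRANT USE CATALOG ON CATALOG analytics TO `data-analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA analytics.sales TO `data-analysts`")
spark.sql("GRANT SELECT ON TABLE analytics.sales.orders TO `data-analysts`")

# Audit which privileges the group now holds via the catalog's
# information_schema views.
spark.sql("""
    SELECT grantee, privilege_type, table_name
    FROM analytics.information_schema.table_privileges
    WHERE grantee = 'data-analysts'
""").show()
```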
Integrate with Cloud Data Platforms
Work with Azure Data Lake Storage / Azure Blob Storage / Azure Event Hubs to integrate Databricks with cloud-based data lakes, data warehouses, and event streams
Implement Delta Lake for scalable, ACID-compliant storage
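
A short sketch of a Delta Lake upsert against ADLS, showing the ACID-compliant MERGE pattern; the storage account and paths are hypothetical:

```python
# Delta Lake upsert sketch over ADLS (storage account and paths are hypothetical).
from delta.tables import DeltaTable
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

updates = spark.read.format("delta").load(
    "abfss://bronze@examplelake.dfs.core.windows.net/customers_updates"
)

target = DeltaTable.forPath(
    spark, "abfss://silver@examplelake.dfs.core.windows.net/customers"
)

# MERGE gives ACID upserts: matched rows are updated, new rows inserted,
# and concurrent readers always see a consistent snapshot.
(target.alias("t")
       .merge(updates.alias("u"), "t.customer_id = u.customer_id")
       .whenMatchedUpdateAll()
       .whenNotMatchedInsertAll()
       .execute())
```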
Automate & Orchestrate Workflows
Develop CI/CD pipelines for data workflows using Azure Databricks Workflows or Azure Data Factory
Monitor and troubleshoot failures in job execution and cluster performance
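
One way to orchestrate such a workflow is the Databricks Jobs 2.1 REST API; the sketch below creates a scheduled job with failure alerting. The workspace host, token, notebook path, cluster ID, and email are hypothetical placeholders:

```python
# Sketch: creating a scheduled Databricks Workflows job via the Jobs 2.1 REST API.
# Host, token, notebook path, cluster id, and email are hypothetical placeholders.
import os
import requests

resp = requests.post(
    f"{os.environ['DATABRICKS_HOST']}/api/2.1/jobs/create",
    headers={"Authorization": f"Bearer {os.environ['DATABRICKS_TOKEN']}"},
    json={
        "name": "nightly_orders_etl",
        "schedule": {"quartz_cron_expression": "0 0 2 * * ?", "timezone_id": "UTC"},
        "tasks": [{
            "task_key": "run_etl",
            "notebook_task": {"notebook_path": "/Repos/data/etl/orders_etl"},
            "existing_cluster_id": "1234-567890-abcdefgh",
        }],
        # Failure alerting supports the monitoring/troubleshooting duty above.
        "email_notifications": {"on_failure": ["data-eng@example.com"]},
    },
    timeout=30,
)
resp.raise_for_status()
print("Created job:", resp.json()["job_id"])
```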
Collaborate with Stakeholders
Work with Data Analysts, Data Scientists, and Business Teams to understand requirements
Translate business needs into scalable data engineering solutions
API expertise
Ability to pull data from a wide variety of APIs using different strategies and methods, such as pagination, incremental pulls, and retry with backoff (see the sketch below)
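
A minimal sketch of one such strategy, offset pagination with retry and exponential backoff; the endpoint, parameters, and token are hypothetical:

```python
# Sketch: pulling data from a paginated REST API with retries and backoff.
# The endpoint, page parameters, and token are hypothetical.
import time
import requests

def fetch_all(base_url: str, token: str, page_size: int = 100) -> list[dict]:
    """Offset pagination with simple retry-and-backoff on transient errors."""
    session = requests.Session()
    session.headers["Authorization"] = f"Bearer {token}"
    records, offset = [], 0
    while True:
        for attempt in range(3):
            resp = session.get(base_url, params={"limit": page_size, "offset": offset})
            if resp.status_code in (429, 500, 502, 503):
                time.sleep(2 ** attempt)  # exponential backoff on throttling/outages
                continue
            resp.raise_for_status()
            break
        else:
            resp.raise_for_status()  # give up after repeated transient failures
        page = resp.json()["results"]
        records.extend(page)
        if len(page) < page_size:  # short page means the last one was reached
            return records
        offset += page_size

data = fetch_all("https://api.example.com/v1/orders", token="...")
```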
Requirements:
Azure Databricks & Apache Spark (PySpark) – Strong experience in building distributed data pipelines
Python – Proficiency in writing optimized and maintainable Python code for data engineering
Unity Catalog – Hands-on experience implementing data governance, access controls, and lineage tracking
SQL – Strong knowledge of SQL for data transformations and optimizations
Delta Lake – Understanding of time travel, schema evolution, and performance tuning (see the sketch after this list)
Workflow Orchestration – Experience with Azure Databricks Jobs or Azure Data Factory
CI/CD & Infrastructure as Code (IaC) – Familiarity with Databricks CLI, Databricks DABs, and DevOps principles
Security & Compliance – Knowledge of IAM, role-based access control (RBAC), and encryption
Nice to have:
Experience with MLflow for model tracking & deployment in Databricks (a minimal sketch follows this list)
Familiarity with streaming technologies (Kafka, Delta Live Tables, Azure Event Hub, Azure Event Grid)
Hands-on experience with dbt (Data Build Tool) for modular ETL development
Databricks or Azure certification is a plus
Experience with Azure Databricks Lakehouse connectors for Salesforce and SQL Server
Experience with Azure Synapse Link for Dynamics 365 and Dataverse
Familiarity with other data pipeline approaches, such as Azure Functions, Microsoft Fabric, and Azure Data Factory
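
As referenced in the MLflow item above, a minimal tracking-and-logging sketch; the experiment path, dataset, and model are hypothetical:

```python
# Sketch of MLflow experiment tracking and model logging in Databricks;
# the experiment path, dataset, and model are hypothetical.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, n_features=5, random_state=0)

mlflow.set_experiment("/Shared/churn-experiments")  # hypothetical workspace path
with mlflow.start_run():
    model = LogisticRegression(max_iter=200).fit(X, y)
    mlflow.log_param("max_iter", 200)
    mlflow.log_metric("train_accuracy", model.score(X, y))
    # Log the fitted model so it can later be registered and deployed.
    mlflow.sklearn.log_model(model, "model")
```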