The Data Engineer plays a critical role in designing, building, and maintaining high-quality data pipelines on the Databricks platform. This position centers on the ingestion, transformation, and processing of large-scale datasets to derive meaningful business insights. Success in the role requires writing efficient code, primarily in Python and SQL, while leveraging tools such as PySpark and Delta Lake. Integration with cloud services, especially Azure, is a key component. The Data Engineer is expected to uphold the highest standards of data quality, security, and performance, and to work collaboratively with both technical and non-technical stakeholders, translating business requirements into actionable, data-driven solutions.
Job Responsibilities:
Design, build, and maintain scalable data pipelines for both batch and streaming data, sourced from a variety of systems. The primary technologies used are PySpark and Databricks SQL (see the sketch after this list)
Maintain high standards of data quality, integrity, and security throughout every stage of the data lifecycle
Track, monitor, and report on platform compute costs, and escalate any anomalies
Tune and optimize Databricks jobs and Spark configurations to enhance both performance and cost efficiency
Integrate Databricks with other cloud services for storage, compute, and security, with a particular focus on Azure Data Lake Storage
Work in close partnership with cross-functional teams—including data scientists, analysts, and business stakeholders—to understand requirements and deliver data-driven solutions tailored to their needs
Monitor the performance of data pipelines, troubleshoot issues as they arise, and provide support for user requests within the Databricks environment
Implement and enforce best practices for data governance, security, and compliance in all aspects of data engineering activities
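For illustration, the following is a minimal sketch of the kind of batch and streaming ingestion described above. It assumes a Databricks cluster where spark is the preconfigured SparkSession; the storage path, Kafka broker, topic, and table names are placeholders invented for the example, not details from this posting.

    # Minimal sketch, assuming a Databricks environment where `spark`
    # is the preconfigured SparkSession and the bronze schema exists.
    from pyspark.sql import functions as F

    # Batch ingestion: load a daily extract from Azure Data Lake Storage
    # into a Delta table (placeholder container and path).
    raw = (
        spark.read.format("csv")
        .option("header", "true")
        .load("abfss://landing@example.dfs.core.windows.net/erp/daily/")
    )
    (
        raw.withColumn("ingested_at", F.current_timestamp())
        .write.format("delta")
        .mode("append")
        .saveAsTable("bronze.erp_daily")
    )

    # Streaming ingestion: consume a Kafka topic and append to a Delta
    # table, with a checkpoint location for fault-tolerant progress tracking.
    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")  # placeholder broker
        .option("subscribe", "crm-events")                 # placeholder topic
        .load()
        .select(F.col("value").cast("string").alias("payload"),
                F.col("timestamp"))
    )
    (
        events.writeStream.format("delta")
        .option("checkpointLocation", "/checkpoints/crm_events")
        .toTable("bronze.crm_events")
    )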
Requirements:
3–5 years of experience
Proficiency in data engineering principles, including the development and maintenance of data pipelines
Advanced coding skills in Python, SQL, and Scala, with significant experience working with Apache Spark
Hands-on experience with the Databricks platform, particularly with Delta Lake, Databricks Runtime, and Databricks Workflows
Familiarity with the Microsoft Azure cloud platform
Knowledge of the Medallion architecture (Bronze, Silver, and Gold layers; see the sketch after this list)
Experience in data ingestion, transformation, and loading processes (ETL/ELT)
Excellent communication skills, with the ability to explain complex data concepts to both technical and non-technical audiences
Strong problem-solving and analytical abilities
Background in creating ingestion pipelines from a variety of systems, such as HRIS, ERP, CRM, Microsoft SQL Server, and Apache Kafka
Experience with machine learning and data analytics
Knowledge of data governance and security best practices
Databricks certifications are an asset
Knowledge or experience in developing and integrating custom machine learning models using Azure Machine Learning, MLflow, and other relevant libraries
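As a hypothetical illustration of the Medallion architecture mentioned above: Bronze holds raw ingested data, Silver cleans and conforms it, and Gold serves business-level aggregates. The sketch below continues from the bronze table in the earlier example; all table and column names are invented for illustration.

    # Minimal Medallion (Bronze -> Silver -> Gold) sketch, assuming the
    # bronze.crm_events table from the ingestion example above.
    from pyspark.sql import functions as F

    # Silver: parse the raw payload, keep typed columns, drop duplicates.
    silver = (
        spark.read.table("bronze.crm_events")
        .withColumn("event", F.from_json("payload",
                    "customer_id STRING, amount DOUBLE"))
        .select("event.customer_id", "event.amount", "timestamp")
        .dropDuplicates(["customer_id", "timestamp"])
    )
    silver.write.format("delta").mode("overwrite").saveAsTable("silver.crm_events")

    # Gold: aggregate into a business-level table for analysts.
    gold = (
        spark.read.table("silver.crm_events")
        .groupBy("customer_id")
        .agg(F.sum("amount").alias("lifetime_value"))
    )
    gold.write.format("delta").mode("overwrite").saveAsTable("gold.customer_value")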
Nice to have:
Experience with the MuleSoft API platform