We are seeking an Advanced Data Engineer with strong expertise in the Databricks ecosystem to join our data engineering team. The ideal candidate will design, develop, and optimize robust data pipelines and frameworks that support data analytics, machine learning, and reporting initiatives. You will play a key role in ensuring data governance, observability, and automation within a modern data stack. The role also requires strong skills in SQL, Databricks (DBX), PySpark, and data engineering fundamentals, as well as experience with workflow orchestration tools such as Apache Airflow.
Job Responsibilities
Understand, analyze, and contribute to the current Databricks architecture and design principles, ensuring scalability and performance
Develop and maintain efficient data processing scripts using Python and PySpark, ensuring clean, reusable, and scalable code
Demonstrate a deep understanding of datasets, including structure, lineage, semantics, and business context
Use GitHub for version control and collaboration, and GitHub Actions to automate workflows and CI/CD pipelines
Configure and maintain CI/CD pipelines in a DevOps environment for seamless code integration and deployment
Leverage AI coding assistants like GitHub Copilot and Databricks Assistant to improve development efficiency and code quality
Collaborate with cross-functional teams including data scientists, analysts, and platform engineers
Utilize advanced SQL for data transformation, analysis, and troubleshooting across large-scale datasets
Apply strong data engineering principles to design, optimize, and maintain scalable ETL/ELT processes
Build and manage data workflows using Apache Airflow or similar orchestration tools to ensure reliable automation and scheduling
Work extensively within the DBX (Databricks) environment to develop scalable pipelines and enforce best practices across the platform
Requirements
5+ years of experience in data engineering or related roles
Proficient in Python and PySpark, with a strong foundation in distributed data processing
Hands-on experience working with Databricks (DBX), including workspace administration and Unity Catalog integration
Strong understanding of data security and governance best practices
Proficiency in SQL, including complex queries, optimization, and performance tuning
Experience with monitoring tools such as Datadog for data system observability
Proficiency in Git/GitHub, including pull requests, branching strategies, and GitHub Actions
Experience with DevOps practices related to CI/CD, especially in data pipeline deployments
Familiarity with AI-powered coding tools such as GitHub Copilot and Databricks Assistant
Strong problem-solving skills and ability to work in a fast-paced, collaborative environment
Experience in workflow orchestration, preferably with Apache Airflow
Nice to have
Databricks or Azure certifications
Experience with cloud platforms (Azure) in a data engineering context
Familiarity with modern data stack tools and frameworks
Excellent communication and documentation skills
What we offer
Medical, vision, dental, life, and disability insurance
Paid time off (including holidays, parental leave, and sick leave, as required by law)