Develop, implement, and optimize complex Data Warehouse (DWH) and Data Lakehouse solutions on the Databricks platform, ensuring a scalable, high-performance, and governed data foundation for analytics, reporting, and Machine Learning.
Job Responsibilities:
Design and implement robust, scalable, and high-performance ETL/ELT data pipelines using PySpark/Scala and Databricks SQL on the Databricks platform
Implement and optimize the Medallion architecture (Bronze, Silver, Gold) using Delta Lake to ensure data quality, consistency, and historical tracking
Implement the Lakehouse architecture efficiently on Databricks, combining best practices from DWH and Data Lake
Optimize Databricks clusters, Spark operations, and Delta tables to reduce latency and computational costs
Design and implement real-time/near-real-time data processing solutions using Spark Structured Streaming and Delta Live Tables
Implement and manage Unity Catalog for centralized data governance, data security, and data lineage
Define and implement data quality standards and rules to maintain data integrity
Develop and manage complex workflows using Databricks Workflows or external tools to automate pipelines
Integrate Databricks pipelines into CI/CD processes
Work closely with Data Scientists, Analysts, and Architects to understand business requirements and deliver optimal technical solutions
Provide technical guidance and mentorship to junior developers and promote best practices.
Requirements:
Proven, expert-level experience with the entire Databricks ecosystem (Workspace, Cluster Management, Notebooks, Databricks SQL)
In-depth knowledge of Spark architecture (RDD, DataFrames, Spark SQL) and advanced optimization techniques
Expertise in implementing and managing Delta Lake (ACID properties, Time Travel, Merge, Optimize, Vacuum)
Advanced/expert-level proficiency in Python (with PySpark) and/or Scala (with Spark)
Advanced/expert-level skills in SQL and Data Modeling (Dimensional, 3NF, Data Vault)
Solid experience with a major Cloud platform (AWS, Azure, or GCP), especially with storage services (S3, ADLS Gen2, GCS) and networking.
Nice to have:
Hands-on experience with implementing and managing Unity Catalog
Experience with Delta Live Tables (DLT) and Databricks Workflows
Understanding of basic MLOps concepts and experience with MLflow to facilitate integration with Data Science teams
Experience with Terraform or equivalent tools for Infrastructure as Code (IaC)
Databricks certifications (e.g., Databricks Certified Data Engineer Professional).
What we offer:
Full access to a foreign language learning platform
Personalized access to tech learning platforms
Tailored workshops and trainings to sustain your growth
Medical insurance
Meal tickets
Monthly budget to allocate on a flexible benefits platform