The Senior Data Engineer will design and oversee data pipelines in Databricks on AWS, manage integrations with SaaS platforms and implement robust data quality and observability frameworks. This role ensures reliable, high-performance data delivery for enterprise analytics and AI workloads.
Job Responsibilities:
Design, build and maintain scalable ETL/ELT pipelines that ingest, transform and deliver trusted data for analytics and AI use cases
Build data integrations with well-known SaaS platforms such as Salesforce, NetSuite and Jira
Implement incremental and historical data processing to ensure accurate, up-to-date data sets
Ensure data quality, reliability and performance across pipelines through validation, testing and continuous code optimization
Contribute to data governance and security by supporting data lineage, metadata management and data access controls
Support production operations, including monitoring, alerting and troubleshooting
Work with stakeholders to translate business and technical requirements into well-structured, reliable datasets
Share knowledge and contribute to team standards, documentation and engineering best practices
Requirements:
Data Ingestion & Integration: hands-on experience building robust ingestion pipelines using tools and patterns such as Databricks Auto Loader, Lakeflow Connectors, Fivetran and/or custom API / file-based integrations (an Auto Loader sketch follows this list)
Core Data Engineering: strong development experience using SQL, Python and Apache Spark (PySpark) for large-scale data processing
Data Pipeline Orchestration: proven experience developing and operating data pipelines using Databricks Workflows & Jobs, Delta Live Tables (DLT) and/or Lakeflow Declarative Pipelines
Incremental Processing & Data Modelling: deep understanding of incremental data loading, including Change Data Capture (CDC), MERGE operations and Slowly Changing Dimensions (SCD) in a Lakehouse environment (see the MERGE sketch after this list)
Data Transformation & Lakehouse Design: experience in designing and implementing Medallion Architecture (bronze, silver and gold) using Delta Lake
Data Quality, Testing and Observability: experience implementing data quality checks with tools and frameworks such as DLT expectations, Great Expectations or similar, including pipeline testing and monitoring (see the expectations sketch after this list)
Data Governance & Lineage: hands-on experience with data cataloguing, lineage and metadata management within Unity Catalog to support governance, auditing and troubleshooting
Performance Optimization: experience tuning Spark and Databricks workloads, including partitioning strategies, file sizing, query optimization and efficient use of Delta Lake features (see the maintenance sketch after this list)
Production Engineering Practices: experience working with code versioning (Git), peer review and promoting pipelines through development, test and production environments
Security & Access Control Awareness: understanding of data access control, sensitive data handling and working with Unity Catalog in the context of governed environments
Stakeholder & Team Collaboration: strong communication and analytical skills working with business and technical stakeholders to gather requirements, explain data concepts and support downstream users such as analysts and dashboard developers
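For context, a minimal sketch of the ingestion pattern referenced above: Auto Loader streaming raw files into a bronze-layer Delta table. The bucket paths and the table name (main.bronze.orders) are hypothetical, and a Databricks runtime providing the spark session is assumed.

```python
# Minimal Auto Loader ingestion sketch; paths and table names are illustrative.
# Assumes a Databricks runtime where a SparkSession is available as `spark`.
raw_path = "s3://example-bucket/raw/orders/"            # hypothetical landing zone
checkpoint = "s3://example-bucket/_checkpoints/orders"  # hypothetical checkpoint location

bronze_stream = (
    spark.readStream.format("cloudFiles")               # Auto Loader source
    .option("cloudFiles.format", "json")                 # raw files arrive as JSON
    .option("cloudFiles.schemaLocation", checkpoint)     # track and evolve the inferred schema
    .load(raw_path)
)

(
    bronze_stream.writeStream
    .option("checkpointLocation", checkpoint)
    .trigger(availableNow=True)                           # process new files incrementally, then stop
    .toTable("main.bronze.orders")                        # bronze-layer Delta table
)
```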
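A similarly hedged sketch of the incremental MERGE / CDC pattern, assuming a hypothetical change feed with an op column marking inserts, updates and deletes:

```python
from delta.tables import DeltaTable

# Illustrative CDC-style incremental upsert from a hypothetical change feed
# into a silver-layer Delta table; all table and column names are made up.
changes = spark.table("main.bronze.orders_changes")       # hypothetical change feed
silver = DeltaTable.forName(spark, "main.silver.orders")  # hypothetical target table

(
    silver.alias("t")
    .merge(changes.alias("s"), "t.order_id = s.order_id")
    .whenMatchedDelete(condition="s.op = 'DELETE'")         # apply deletes from the feed
    .whenMatchedUpdateAll(condition="s.op <> 'DELETE'")     # update existing rows
    .whenNotMatchedInsertAll(condition="s.op <> 'DELETE'")  # insert new rows
    .execute()
)
```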
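A short Delta Live Tables example of the kind of data-quality expectations mentioned above; the table names and rules are illustrative only:

```python
import dlt
from pyspark.sql import functions as F

# Illustrative Delta Live Tables step with data-quality expectations;
# table names and rules are hypothetical.
@dlt.table(comment="Orders with basic quality checks applied")
@dlt.expect_or_drop("valid_order_id", "order_id IS NOT NULL")  # drop rows that fail the rule
@dlt.expect("positive_amount", "amount > 0")                   # record violations but keep rows
def silver_orders():
    return dlt.read_stream("bronze_orders").withColumn(
        "ingested_at", F.current_timestamp()
    )
```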
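Finally, an illustrative Delta Lake maintenance step relevant to the performance-optimization requirement; the table, Z-order column and retention window are placeholders:

```python
# Illustrative Delta Lake maintenance for performance; names and retention are placeholders.
spark.sql("OPTIMIZE main.silver.orders ZORDER BY (order_date)")  # compact small files, co-locate by key
spark.sql("VACUUM main.silver.orders RETAIN 168 HOURS")          # remove unreferenced files older than 7 days
```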
Nice to have:
Experience with Amazon Web Services (AWS)
Understanding of DevOps best practices and solutions such as Infrastructure-as-Code (Terraform), Databricks Asset Bundles and CI/CD pipelines (Jenkins)
Familiarity with data warehousing and dimensional modelling methodologies (e.g. Kimball, facts & dimensions, star schemas, data marts)
Basic understanding of AI & ML, including preparation of structured and unstructured data for ML use cases and AI agents