We are looking for an experienced Data Lake / ETL Engineer with 7+ years of experience in designing, developing, and managing large-scale data ingestion, transformation, and analytics pipelines. The role involves building scalable and secure data lake platforms, enabling business insights through efficient ETL/ELT frameworks, and ensuring data quality, performance, and governance across the enterprise ecosystem.
Job Responsibilities:
Design and implement data ingestion pipelines for structured, semi-structured, and unstructured data
Develop and manage ETL/ELT processes for large-scale data processing
Optimize storage and retrieval strategies across on-prem and cloud-based data lakes
Integrate data from multiple sources (databases, APIs, streaming platforms)
Implement real-time and batch processing using Apache Spark, Kafka, or Flink (a minimal batch example appears after this list)
Support metadata management, data lineage, and cataloging
Tune queries and pipelines for high performance and cost efficiency
Implement partitioning, indexing, and caching strategies for large datasets
Automate routine ETL/ELT workflows for reliability and speed
Ensure compliance with data governance, privacy, and regulatory standards (GDPR, HIPAA, etc.)
Implement encryption, masking, and role-based access control (RBAC)
Collaborate with cybersecurity teams to align with Zero Trust and IAM policies
Partner with data scientists, analysts, and application teams for analytics enablement
Provide L2/L3 support for production pipelines and troubleshoot failures
Mentor junior engineers and contribute to best practices documentation
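To make the batch-processing and partitioning items above more concrete, here is a minimal PySpark sketch. It is an illustration only, not a description of the team's actual pipelines: the paths, column names, and partition key are hypothetical placeholders.

# Minimal PySpark sketch: batch ingestion with a partitioned write.
# All paths, column names, and the partition key are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("orders-batch-ingest")  # hypothetical job name
    .getOrCreate()
)

# Read semi-structured source data (JSON) from a landing zone.
raw = spark.read.json("s3a://landing-zone/orders/")  # hypothetical path

# Light transformation: parse the timestamp, derive a partition column, deduplicate.
cleaned = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .dropDuplicates(["order_id"])
)

# A date-partitioned Parquet write lets large scans prune by partition.
(
    cleaned.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3a://data-lake/curated/orders/")  # hypothetical path
)

spark.stop()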
Requirements:
7+ years of experience in data engineering, ETL/ELT development, or data lake management
Strong expertise in ETL tools (Informatica, Talend, dbt, SSIS, or similar)
Hands-on experience with big data ecosystems: Hadoop, Spark, Hive, Presto, Delta Lake, or Iceberg
Proficiency with SQL, Python, or Scala for data processing and transformation
Experience with cloud data platforms (AWS Glue, Redshift, Azure Synapse, GCP BigQuery)
Familiarity with workflow orchestration tools (Airflow, Temporal, Oozie); a minimal DAG sketch follows below
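As a rough illustration of the workflow-orchestration requirement, the sketch below defines a two-task Airflow DAG, assuming a standard Airflow 2.x installation; the DAG id, schedule, and task bodies are hypothetical.

# Minimal Airflow 2.x DAG sketch: a two-step daily ETL workflow.
# DAG id, schedule, and task callables are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_orders():
    # Placeholder: pull raw records from a source system.
    print("extracting orders")


def load_orders():
    # Placeholder: write transformed records into the data lake.
    print("loading orders")


with DAG(
    dag_id="orders_daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    load = PythonOperator(task_id="load_orders", python_callable=load_orders)

    # Explicit dependency: extract must finish before load starts.
    extract >> load

Airflow, Temporal, and Oozie all express the same idea of explicit task dependencies; the operator chaining here is simply the Airflow-specific form.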
Nice to have:
Exposure to real-time data streaming (Kafka, Kinesis, Pulsar); see the consumer sketch after this list
Knowledge of data modeling (Kimball/Inmon), star schema, and dimensional modeling
Experience with containerized deployments (Docker, Kubernetes)
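For the real-time streaming item above, a minimal consumer loop might look like the sketch below, assuming the confluent-kafka Python client; the broker address, topic, and consumer group are hypothetical.

# Minimal Kafka consumer sketch using the confluent-kafka Python client.
# Broker address, topic name, and consumer group are hypothetical placeholders.
import json

from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # hypothetical broker
    "group.id": "orders-ingest",            # hypothetical consumer group
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["orders"])              # hypothetical topic

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            # Skip errored messages; a real pipeline would log and alert here.
            continue
        event = json.loads(msg.value())
        # Placeholder: hand the event to a downstream sink or micro-batch.
        print(event.get("order_id"))
finally:
    consumer.close()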