We are seeking a highly skilled and motivated Big Data Engineer to join our dynamic team. The ideal candidate will have extensive experience in designing, developing, and optimizing scalable data solutions using the Hadoop ecosystem, with a strong focus on PySpark and Hive. This role is crucial for building robust ETL pipelines, ensuring data quality, and driving performance improvements across our Big Data initiatives.
Job Responsibilities:
Design, develop, and maintain efficient and scalable Big Data solutions using PySpark, Apache Hive, and Hadoop ecosystem tools (e.g., Sqoop)
Implement and optimize ETL (Extract, Transform, Load) processes and data warehousing solutions, including fact and dimension tables and Slowly Changing Dimensions (SCD-2)
Conduct in-depth data analysis, troubleshoot complex data issues, and ensure the accuracy, reliability, and integrity of data
Optimize Big Data workflows, including Spark job tuning and Hive query optimization, leveraging partitioning strategies and indexing techniques in distributed storage systems
Perform rigorous unit testing and validation of data pipelines and transformations
Collaborate with data scientists, analysts, and other engineers to understand data requirements and deliver robust data solutions
Requirements:
Demonstrated proficiency with Apache Hadoop, Apache Hive, and PySpark for data processing and analysis
Strong understanding and practical experience with data warehousing concepts, dimensional modeling, and SCD-2 implementation
Proven experience in designing and developing ETL pipelines; familiarity with various ETL tools is an advantage
Advanced SQL knowledge, including complex joins, subqueries, and performance tuning of SQL queries
Proficient in shell scripting for automation of batch processes
Experience with version control and CI/CD tools such as Bitbucket and Jenkins
Familiarity with business intelligence (BI) reporting tools like Tableau
Excellent critical thinking and problem-solving skills with a strong analytical mindset
Ability to work independently and collaboratively in a fast-paced environment
Strong communication skills to articulate technical concepts and solutions effectively
Bachelor’s/University degree or equivalent experience
Nice to have:
Experience and/or certifications with major cloud platforms and their Big Data services (e.g., AWS, Azure Databricks, Google Cloud)
Advanced knowledge of Unix shell scripting for system administration and automation