The role is responsible for designing, building, and maintaining data solutions, and for analyzing and interpreting data to provide actionable insights that drive business decisions. This role involves working with large datasets, developing reports, supporting and executing data governance initiatives, and visualizing data to ensure it is accessible, reliable, and efficiently managed. The ideal candidate has strong technical skills, experience with big data technologies, and a deep understanding of data architecture and ETL processes.
Job Responsibilities:
Design, develop, and maintain data solutions for data generation, collection, and processing
Be a key team member who assists in the design and development of the data pipeline
Create data pipelines and ensure data quality by implementing ETL processes to migrate and deploy data across systems
Contribute to the design, development, and implementation of data pipelines, ETL/ELT processes, and data integration solutions
Take ownership of data pipeline projects from inception to deployment, managing scope, timelines, and risks
Collaborate with cross-functional teams to understand data requirements and design solutions that meet business needs
Develop and maintain data models, data dictionaries, and other documentation to ensure data accuracy and consistency
Implement data security and privacy measures to protect sensitive data
Leverage cloud platforms (AWS preferred) to build scalable and efficient data solutions
Collaborate and communicate effectively with product teams
Collaborate with Data Architects, Business SMEs, and Data Scientists to design and develop end-to-end data pipelines to meet fast-paced business needs across geographic regions
Identify and resolve complex data-related challenges
Adhere to best practices for coding, testing, and designing reusable code and components
Explore new tools and technologies that will help to improve ETL platform performance
Participate in sprint planning meetings and provide estimates for technical implementation
Design and develop data pipelines leveraging Databricks, PySpark, and SQL to ingest, transform, and process large-scale datasets (a brief illustrative sketch follows this list)
Engineer solutions for both structured and unstructured data to enable advanced analytics and insights
Implement automated workflows for data ingestion, transformation, and deployment using Databricks Jobs and notebooks, with ongoing monitoring and scheduling
Apply performance optimization techniques, including Spark job tuning, caching, partitioning, and indexing, to improve scalability and efficiency
Build integrations with multiple data sources, such as SQL databases, APIs, and cloud storage platforms, ensuring seamless connectivity and reliability
Collaborate effectively with global teams across time zones to maintain alignment, resolve issues, and deliver on shared objectives
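As an illustration of the kind of pipeline work described above, here is a minimal PySpark sketch of an ingest-transform-write job; all storage paths, table names, and columns are hypothetical examples, not part of the actual role or its systems.

```python
# Illustrative sketch only: a minimal PySpark ingest-transform-write pipeline.
# All storage paths and column names below are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("example_daily_revenue_pipeline").getOrCreate()

# Ingest: read raw CSV files from cloud storage (hypothetical bucket/path).
orders = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("s3://example-bucket/raw/orders/")
)

# Transform: basic data-quality filtering plus a per-day aggregate.
daily_revenue = (
    orders
    .filter(F.col("order_total").isNotNull())
    .withColumn("order_date", F.to_date("order_timestamp"))
    .groupBy("order_date")
    .agg(F.sum("order_total").alias("revenue"))
)

# Load: write the result partitioned by date for efficient downstream reads.
(
    daily_revenue.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-bucket/curated/daily_revenue/")
)
```

In a Databricks environment, a job of this kind would typically be packaged as a notebook or workflow task and scheduled and monitored via Databricks Jobs, as noted in the responsibilities above.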
Requirements:
Bachelor’s or Master’s degree and 5 to 8 years of experience in Computer Science, IT, or a related field
Hands-on experience with big data technologies and platforms such as Databricks and Apache Spark (PySpark, Spark SQL), including workflow orchestration and performance tuning for big data processing
Proficiency in data analysis tools (e.g., SQL) and experience with data visualization tools
Excellent problem-solving skills and the ability to work with large, complex datasets
Strong understanding of data governance frameworks, tools, and best practices
Nice to have:
Knowledge of data protection regulations and compliance requirements (e.g., GDPR, CCPA)
Experience with ETL tools such as Apache Spark, and with Python packages for data processing and machine learning model development
Strong understanding of data modeling, data warehousing, and data integration concepts
Knowledge of Python/R, Databricks, SageMaker, and cloud data platforms
Experience implementing automated orchestration and monitoring of data pipelines using Databricks Jobs, Apache Airflow, or similar workflow tools
Familiarity with performance optimization techniques for big data processing, such as Spark job tuning, caching, partitioning, and indexing
Exposure to multi-source integration involving APIs, SQL databases, and cloud storage platforms
Demonstrated ability to collaborate across global teams and time zones, ensuring alignment and delivery in distributed environments
Professional Certifications (preferred): Certified Data Engineer / Data Analyst, ideally on Databricks or cloud environments
Soft Skills: Excellent critical-thinking and problem-solving skills
Strong communication and collaboration skills
Demonstrated ability to work effectively in a team setting