We are looking for a Senior Data Engineer with strong hands‑on expertise in PySpark, Python, Hadoop, ETL, RDBMS, and Unix as primary skills, along with exposure to GCP Vertex AI and Agentic AI as secondary skills, enabling contributions to advanced analytics and AI‑driven data solutions on cloud platforms. The role involves designing, building, and optimizing large‑scale data pipelines while collaborating with data scientists and AI teams to support intelligent, automated systems.
Job Responsibilities:
Design, develop, and maintain large‑scale batch and streaming data pipelines using PySpark and Hadoop (a minimal example of such a pipeline is sketched after this list)
Write high‑performance Python and PySpark code for data ingestion, processing, and transformation
Build and optimize ETL frameworks for reliable and scalable data movement
Work with RDBMS platforms for data modeling, query optimization, and integration
Perform data validation, reconciliation, and quality checks
Troubleshoot performance bottlenecks in distributed data processing jobs
Use Unix/Linux environments for scripting, scheduling, monitoring, and troubleshooting
Collaborate with data architects and downstream analytics teams
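
To give candidates a concrete sense of the day‑to‑day work, the sketch below shows a minimal PySpark batch pipeline of the kind these responsibilities describe: ingesting from an RDBMS over JDBC, applying basic data‑quality checks, and writing partitioned Parquet to HDFS. All connection details, table names, columns, and paths (orders, sales, hdfs:///warehouse/..., etc.) are hypothetical placeholders for illustration, not a description of any actual system.

```python
# Minimal sketch of a batch ETL job: RDBMS -> validate/transform -> HDFS Parquet.
# Every table, column, URL, and path below is a hypothetical placeholder.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("orders_daily_etl")  # hypothetical job name
    .getOrCreate()
)

# Ingest: pull order rows from a relational database over JDBC.
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:postgresql://db-host:5432/sales")  # placeholder URL
    .option("dbtable", "public.orders")                     # placeholder table
    .option("user", "etl_user")
    .option("password", "***")
    .load()
)

# Validate + transform: drop rows failing basic quality checks,
# then aggregate to a daily per-customer summary.
daily = (
    orders
    .filter(F.col("order_ts").isNotNull() & (F.col("amount") > 0))
    .withColumn("order_date", F.to_date("order_ts"))
    .groupBy("order_date", "customer_id")
    .agg(
        F.sum("amount").alias("total_amount"),
        F.count("*").alias("order_count"),
    )
)

# Load: write partitioned Parquet to HDFS for downstream analytics teams.
(
    daily.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("hdfs:///warehouse/sales/daily_orders")  # placeholder path
)

spark.stop()
```

In practice a job like this would typically be parameterized by run date, scheduled and monitored from a Unix/Linux environment, and tuned (partitioning, predicate pushdown, shuffle settings) to resolve the kinds of performance bottlenecks noted above.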
Requirements:
7–10 years of experience in Data Engineering or Big Data roles
Strong hands‑on experience with PySpark, Python, and Hadoop
Proven expertise in ETL design and enterprise data pipelines
Solid understanding of RDBMS concepts and SQL optimization
Experience working in Unix/Linux environments
Exposure to GCP Vertex AI or AI/ML platforms is a strong plus
Understanding of Agentic AI concepts is desirable
Strong problem‑solving and communication skills
Nice to have:
Experience with cloud data platforms (GCP preferred)