The Senior Python Lead and Developer will play a crucial role in delivering data engineering initiatives. The ideal candidate will have 8-10 years of experience building data pipelines with Python and PySpark, along with expertise in Airflow and CI/CD processes. A bachelor's degree in Computer Science, Data Analytics, or a related field (or equivalent work experience) is required. The position involves mentoring junior engineers and collaborating with cross-functional teams to deliver high-quality solutions.
Job Responsibilities:
Design, develop, and optimize scalable data pipelines using Python and PySpark for batch and streaming workloads
Build, schedule, and monitor complex workflows using Airflow, ensuring reliability and maintainability (see the sketch after this list)
Architect and implement CI/CD pipelines for data engineering projects using GitHub, Docker, and cloud-native solutions
Apply test-driven development (TDD) practices and automate unit/integration tests for data pipelines
Implement secure coding best practices and design patterns throughout the development lifecycle
Work closely with Data Architects, QA teams, and business stakeholders to translate requirements into technical solutions
Create and maintain technical documentation, including process/data flow diagrams and system design artifacts
Lead and mentor junior engineers, providing guidance on coding, testing, and deployment best practices
Analyze and resolve technical issues across the data stack, including pipeline failures and performance bottlenecks
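For illustration, a minimal sketch of the kind of pipeline work described above: an Airflow DAG that schedules a daily PySpark batch job. This is a hedged example assuming Airflow 2.4+ and PySpark; the DAG id, task id, and data paths are hypothetical, not taken from this posting.

    # Hypothetical sketch (Airflow 2.4+): a daily DAG running one PySpark batch task.
    # DAG id, task id, and paths are illustrative only.
    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.operators.python import PythonOperator


    def run_spark_job(ds, **kwargs):
        """Aggregate one day of raw events into per-user counts (illustrative paths)."""
        from pyspark.sql import SparkSession, functions as F

        spark = SparkSession.builder.appName("daily_events_etl").getOrCreate()
        events = spark.read.parquet(f"/data/raw/events/dt={ds}")
        daily = events.groupBy("user_id").agg(F.count("*").alias("event_count"))
        daily.write.mode("overwrite").parquet(f"/data/curated/daily_counts/dt={ds}")
        spark.stop()


    with DAG(
        dag_id="daily_events_etl",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
        default_args={"retries": 2, "retry_delay": timedelta(minutes=5)},
    ):
        PythonOperator(task_id="transform_events", python_callable=run_spark_job)

In practice a job like this would usually be submitted to a cluster (for example via a Spark submit operator) rather than run in-process; the in-process call simply keeps the sketch self-contained.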
Requirements:
8-10 years of practical, hands-on experience
8+ years of experience building production-grade data pipelines using Python and PySpark
Proven track record of designing, deploying, and managing Airflow DAGs in enterprise environments
Ability to build and maintain CI/CD pipelines for data engineering workflows
Experience with containerization (Docker) and cloud platforms (GCP) for data engineering workloads
Ability to write object-oriented Python code, manage dependencies, and follow industry best practices
Proficiency with Git for source code management and collaboration
5+ years of experience with Unix/Linux, including strong command-line skills
5+ years of experience with SQL, including a solid understanding of SQL for data ingestion and analysis
Comfortable with code reviews, pair programming, and effective use of remote collaboration tools
Writes code with an eye for maintainability and testability (see the test sketch after this list)
Bachelor’s or graduate degree in Computer Science, Data Analytics or related field, or equivalent work experience
10+ years of overall IT experience
Experienced in waterfall, iterative, and agile methodologies
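As referenced in the list above, a short sketch of the testable style this role calls for: a pytest unit test exercising a small PySpark transformation against a local SparkSession. The function under test and all data are illustrative assumptions, not part of this posting.

    # Hypothetical sketch: test-driven check of a PySpark transformation with pytest.
    import pytest
    from pyspark.sql import SparkSession, functions as F


    def daily_event_counts(events_df):
        """Transformation under test: per-user event counts (illustrative)."""
        return events_df.groupBy("user_id").agg(F.count("*").alias("event_count"))


    @pytest.fixture(scope="module")
    def spark():
        # A local, single-threaded session keeps pipeline tests fast and isolated.
        session = SparkSession.builder.master("local[1]").appName("tests").getOrCreate()
        yield session
        session.stop()


    def test_daily_event_counts(spark):
        events = spark.createDataFrame(
            [("u1", "click"), ("u1", "view"), ("u2", "click")],
            ["user_id", "event_type"],
        )
        result = {r["user_id"]: r["event_count"]
                  for r in daily_event_counts(events).collect()}
        assert result == {"u1": 2, "u2": 1}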
Nice to have:
Familiarity with OpenShift, Cloudera, Tableau, Confluence, Jira, and other enterprise tools