Role Overview:
Contribute to all phases of the software development life cycle
Play a crucial role in designing, developing, and maintaining a large-scale data platform and data pipelines based on a microservices architecture
Champion and optimize the daily practice of technical excellence across an empowered team
Enhance and implement batch and real-time data solutions already in progress
Mentor other team members
Deliver both business and technical objectives through ambiguity and uncertainty
Shape the future of our data ecosystem
Work with a passionate team and modern technologies to drive innovation that impacts the entire organization
Job Responsibilities:
Lead and guide the design and implementation of scalable distributed systems based on Java microservices
Engineer and optimize data pipelines using solutions such as Apache Hudi, Apache Trino, and Azure ADLS (a Hudi write sketch follows this list)
Collaborate cross-functionally with product, analytics, and AI teams to ensure data is a strategic asset
Advance ongoing modernization efforts, deepening adoption of event-driven architectures and cloud-native technologies
Drive adoption of best practices in data governance, observability, and performance tuning for data workloads
Embed data quality in processing pipelines by defining schema contracts, implementing transformation tests and data assertions, enforcing backward-compatible schema evolution, and automating checks for freshness, completeness, and accuracy across batch and streaming paths before production deployment (see the assertion sketch after this list)
Establish robust observability for data pipelines by implementing metrics, logging, and distributed tracing for streaming jobs, defining SLAs and SLOs for latency and throughput, and integrating alerting and dashboards to enable proactive monitoring and rapid incident response (see the metrics sketch after this list)
Foster a culture of quality through peer reviews, providing constructive feedback and seeking input on your own work
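
To make the pipeline responsibility above concrete, here is a minimal sketch of an upsert into an Apache Hudi table on Azure ADLS Gen2 via the Spark DataSource API. The storage paths, table name, and key fields are hypothetical placeholders, not details from this posting.

```java
// Minimal sketch: upserting a batch into an Apache Hudi table on ADLS Gen2
// through the Spark DataSource API. Paths, table name, and field names are
// hypothetical placeholders for illustration only.
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class HudiUpsertJob {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("hudi-upsert-sketch")
                .getOrCreate();

        // Source batch; in practice this would come from an upstream topic or landing zone.
        Dataset<Row> batch = spark.read()
                .parquet("abfss://landing@account.dfs.core.windows.net/orders/");

        batch.write()
                .format("hudi")
                .option("hoodie.table.name", "orders")
                .option("hoodie.datasource.write.recordkey.field", "order_id")
                .option("hoodie.datasource.write.precombine.field", "updated_at")
                .option("hoodie.datasource.write.operation", "upsert")
                .mode(SaveMode.Append)
                .save("abfss://lake@account.dfs.core.windows.net/tables/orders/");

        spark.stop();
    }
}
```

The precombine field lets Hudi keep the latest version of a record when duplicates arrive, which matters for the real-time paths described above.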
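The data quality responsibility calls for automated freshness and completeness checks before deployment; below is a minimal sketch of such assertions in Spark, assuming hypothetical column names and a fail-fast policy.

```java
// Minimal sketch of pre-deployment data assertions: completeness (no null keys)
// and freshness (latest watermark within a threshold). Column names and the
// fail-fast behavior are illustrative assumptions.
import java.sql.Timestamp;
import java.time.Duration;
import java.time.Instant;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.max;

public final class DataAssertions {
    // Completeness: every row must carry a non-null key.
    public static void assertCompleteness(Dataset<Row> df, String keyColumn) {
        long nullKeys = df.filter(col(keyColumn).isNull()).count();
        if (nullKeys > 0) {
            throw new IllegalStateException(nullKeys + " rows have a null " + keyColumn);
        }
    }

    // Freshness: the newest timestamp must fall within the allowed lag.
    public static void assertFreshness(Dataset<Row> df, String tsColumn, Duration maxLag) {
        Timestamp latest = df.agg(max(col(tsColumn))).first().getTimestamp(0);
        if (latest == null || Instant.now().minus(maxLag).isAfter(latest.toInstant())) {
            throw new IllegalStateException("Data in " + tsColumn + " is staler than " + maxLag);
        }
    }
}
```

In a real pipeline these checks would run as a gating step in CI/CD or in the job itself, failing the deployment rather than letting stale or incomplete data reach consumers.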
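For the observability responsibility, here is a minimal Micrometer-based sketch covering throughput, latency, and lag instrumentation; the metric names and the "orders" pipeline tag are illustrative assumptions.

```java
// Minimal sketch of pipeline observability with Micrometer: a throughput counter,
// a per-batch latency timer with percentiles, and a consumer-lag gauge that an
// alerting rule tied to an SLO could fire on. Metric names are illustrative.
import java.util.concurrent.atomic.AtomicLong;

import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;

public class PipelineMetrics {
    private final Counter recordsProcessed;
    private final Timer batchLatency;
    private final AtomicLong consumerLag = new AtomicLong();

    public PipelineMetrics(MeterRegistry registry) {
        recordsProcessed = Counter.builder("pipeline.records.processed")
                .tag("pipeline", "orders")
                .register(registry);
        batchLatency = Timer.builder("pipeline.batch.latency")
                .publishPercentiles(0.5, 0.99) // track the p99 against the latency SLO
                .register(registry);
        registry.gauge("pipeline.consumer.lag", consumerLag);
    }

    public void recordBatch(int recordCount, Runnable processBatch) {
        batchLatency.record(processBatch);       // time the batch end to end
        recordsProcessed.increment(recordCount); // feed throughput dashboards
    }

    public void updateLag(long lag) {
        consumerLag.set(lag);
    }
}
```

In production the registry would typically be backed by Prometheus or a similar store, with alert rules on the p99 latency and the lag gauge driving the dashboards and incident response mentioned above.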
Requirements:
Principal Software Data Engineer with at least 10 years of professional experience in software or data engineering
Minimum of 4 years focused on data pipelines (batch and streaming)
Proven experience driving technical direction and mentoring engineers while delivering complex, high-scale solutions as a hands-on contributor
Strong understanding of event-driven architectures and distributed systems, with hands-on experience implementing resilient, low-latency pipelines
Practical experience with cloud platforms (AWS, Azure, or GCP) and containerized deployments for data workloads
Fluency in data quality practices and CI/CD integration, including schema management, automated testing, and validation frameworks (e.g., dbt, Great Expectations), as illustrated by the schema-compatibility sketch after this list
Operational excellence in observability, with experience implementing metrics, logging, tracing, and alerting for data pipelines using modern tools
Solid foundation in data governance and performance optimization, ensuring reliability and scalability across batch and streaming environments
Proven experience with Lakehouse architectures and related technologies, including Apache Hudi, Azure ADLS Gen2, HDFS, and other big data technologies (Trino, Databricks, Spark)
Strong collaboration and communication skills, with the ability to influence stakeholders and evangelize modern data practices within your team and organization
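
As an illustration of the schema-management fluency asked for above, here is a minimal sketch of a backward-compatibility gate built on Apache Avro's own checker; the inline schemas are invented examples, not schemas from this role.

```java
// Minimal sketch of a backward-compatibility gate using Avro's built-in checker:
// a proposed (reader) schema must still be able to read data written with the
// current (writer) schema before it is allowed to ship.
import org.apache.avro.Schema;
import org.apache.avro.SchemaCompatibility;
import org.apache.avro.SchemaCompatibility.SchemaCompatibilityType;

public class SchemaGate {
    public static void main(String[] args) {
        Schema current = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Order\",\"fields\":["
                        + "{\"name\":\"id\",\"type\":\"string\"}]}");
        // Proposed evolution: adding a field with a default is backward compatible.
        Schema proposed = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Order\",\"fields\":["
                        + "{\"name\":\"id\",\"type\":\"string\"},"
                        + "{\"name\":\"status\",\"type\":\"string\",\"default\":\"NEW\"}]}");

        SchemaCompatibilityType result = SchemaCompatibility
                .checkReaderWriterCompatibility(proposed, current)
                .getType();
        if (result != SchemaCompatibilityType.COMPATIBLE) {
            throw new IllegalStateException("Schema change is not backward compatible: " + result);
        }
    }
}
```

Run as a CI step, a gate like this enforces the backward-compatible schema evolution described in the responsibilities before any producer change reaches production.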