Reporting to the Manager, Data Engineering, as a Senior Data Engineer you will build tools and infrastructure to support the Data Products and Insights & Innovation teams, and the business as a whole. You will collaborate with all levels of the Data and AI team as well as various engineering teams to develop data solutions, scale our data infrastructure, and advance Wave to the next stage in our transformation as a data-centric organization.
Job Responsibilities:
Design, build, and deploy components of a modern data platform, including CDC-based ingestion using Debezium and Kafka, a centralized Hudi-based data lake, and a mix of batch, incremental, and streaming data pipelines
Maintain and enhance the existing Amazon Redshift data warehouse and legacy Python ELT pipelines
Accelerate the transition to a brand-new Databricks-based analytics and processing environment integrated with dbt
Build fault-tolerant, scalable, and cost-efficient data systems
Continuously improve observability, performance, and reliability across both legacy and modern platforms
Work closely with cross-functional partners to plan and roll out data infrastructure and processing pipelines that support analytics, machine learning, and GenAI use cases
Respond to PagerDuty alerts, troubleshoot incidents, and proactively implement monitoring and alerting
Assess existing systems, improve data accessibility, and deliver practical solutions that enable internal teams to generate actionable insights
Requirements:
6+ years of experience in building data pipelines and managing a secure, modern data stack
Experience with streaming CDC ingestion into a data warehouse that supports AI/ML workloads, using tools like Debezium
At least 3 years of experience working with AWS cloud infrastructure, including Kafka (MSK), Spark / AWS Glue, and infrastructure as code (IaC) using Terraform
Fluency in SQL and a strong understanding of data modelling principles and data storage structures for both OLTP and OLAP
Experience developing or maintaining a production data system on Databricks
Ability to write and review high-quality, maintainable code using Python, SQL, and dbt
Prior experience building data lakes on S3 using Apache Hudi with Parquet, Avro, JSON, and CSV file formats
Experience developing and deploying data pipeline solutions using CI/CD best practices
Strong communication skills
Self-motivated and comfortable working autonomously
Nice to have:
Familiarity with data governance practices, including data quality, lineage, and privacy, as well as experience using cataloging tools
Working knowledge of tools such as Stitch and Segment CDP for integrating diverse data sources
Knowledge and practical experience with Looker, Power BI, Athena, Redshift, or SageMaker Feature Store to support analytical and machine learning workflows
What we offer:
Bonus Structure
Employer-paid Benefits Plan
Health & Wellness Flex Account
Professional Development Account
Wellness Days
Holiday Shutdown
Wave Days (extra vacation days in the summer)
Get A-Wave Program (work from anywhere in the world for up to 90 days)