At Catawiki, data sits at the core of our decision-making, powering everything from commercial strategy and analytics to machine learning, AI, and performance marketing. The Data Engineering role exists to ensure this data foundation is robust, scalable, and ready to support a fast-growing global marketplace. You’ll join a highly collaborative engineering environment, working closely with Machine Learning Engineers, Platform Engineers, and Backend Engineers. The team is responsible for building and evolving the data ecosystem that enables teams across Catawiki to explore, experiment, and innovate with confidence. The scope of the role is intentionally broad, sitting at the intersection of data engineering, data platform engineering, and machine learning enablement.
Job Responsibilities:
Build and Scale Data Pipelines: Maintain and develop reliable batch and streaming pipelines that ingest data from internal systems and third-party sources into Catawiki’s data warehouse
Empower Data Science and AI: Maintain and enhance the tools and platforms used by Data Scientists for analysis, experimentation, model training, and model deployment
Protect Data and Privacy: Ensure data is stored securely and that governance, access control, and privacy standards are consistently applied across the data platform
Run and Evolve the Data Platform: Maintain the infrastructure that hosts our data tools and applications, keeping it scalable, stable, and cost-effective
Own Core Data Tooling: Self-host and operate key data engineering tools such as Airflow and Airbyte on Kubernetes
Keep the Lights On: Provide operational support to ensure pipelines, platforms, and tools run smoothly and reliably for teams across the business
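The ingestion work described above follows a familiar extract-transform-load shape. Here is a minimal, self-contained sketch in plain Python of that pattern; all names (Order, extract, transform, load) and the in-memory "warehouse" are hypothetical stand-ins for a real source system and warehouse writer, not Catawiki's actual stack:

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical record type for rows pulled from an internal system.
@dataclass
class Order:
    order_id: int
    amount_cents: int
    currency: str

def extract(raw_rows: Iterable[dict]) -> list[Order]:
    """Parse raw source rows into typed records, skipping malformed ones."""
    orders = []
    for row in raw_rows:
        try:
            orders.append(Order(int(row["order_id"]),
                                int(row["amount_cents"]),
                                row["currency"]))
        except (KeyError, ValueError):
            continue  # a production pipeline would route these to a dead-letter table
    return orders

def transform(orders: list[Order]) -> list[dict]:
    """Normalize currency casing and convert cents to major units."""
    return [
        {"order_id": o.order_id,
         "amount": o.amount_cents / 100,
         "currency": o.currency.upper()}
        for o in orders
    ]

def load(rows: list[dict], warehouse: list[dict]) -> int:
    """Append rows to the (here: in-memory) warehouse; returns rows written."""
    warehouse.extend(rows)
    return len(rows)

warehouse: list[dict] = []
raw = [
    {"order_id": "1", "amount_cents": "2599", "currency": "eur"},
    {"order_id": "bad", "amount_cents": "x", "currency": "eur"},  # malformed, dropped
]
written = load(transform(extract(raw)), warehouse)
```

In practice the same three stages would be expressed as tasks in an orchestrator such as Airflow, with the warehouse writer targeting BigQuery rather than a Python list.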
Requirements:
3+ years of hands-on experience building and operating data systems in production
Fluent in Python and SQL
Experience with data integration tools such as Fivetran and/or Airbyte
Experience with CI/CD, Infrastructure as Code (e.g. Terraform), and modern DataOps practices
Experience with cloud platforms (GCP is a plus)
Familiar with parts of our data stack, such as BigQuery, Pub/Sub, Dataflow, GKE, Airflow, Airbyte, FastAPI, and Prometheus
Experience with streaming pipelines using technologies like Kafka, Pub/Sub, Dataflow, or Apache Beam
Keen to learn new tools and to support data platform and machine learning engineering initiatives
Understand the importance of data privacy and GDPR compliance
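The streaming experience listed above centres on windowed processing of event streams. A toy, stdlib-only illustration of tumbling-window counts follows; the 60-second window size and the sample events are invented for the example, and a real pipeline would use Kafka or Pub/Sub consumers with Beam/Dataflow windowing primitives instead:

```python
from collections import defaultdict

WINDOW_SECONDS = 60  # tumbling window size; arbitrary for this sketch

def window_start(ts: float) -> int:
    """Map an event timestamp to the start of its tumbling window."""
    return int(ts // WINDOW_SECONDS) * WINDOW_SECONDS

def count_per_window(events: list[tuple[float, str]]) -> dict[int, int]:
    """Count events per window — the core step of a streaming aggregation."""
    counts: dict[int, int] = defaultdict(int)
    for ts, _payload in events:
        counts[window_start(ts)] += 1
    return dict(counts)

# Hypothetical (timestamp, payload) events spanning three windows.
events = [(0.5, "bid"), (59.9, "bid"), (60.1, "bid"), (125.0, "bid")]
result = count_per_window(events)
```

Frameworks like Beam add the hard parts this sketch omits: late data, watermarks, and exactly-once state, which is why the requirement names those technologies rather than hand-rolled code.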
What we offer:
€100 Catavoucher upon joining
€50 Catavoucher on each birthday
An extra day off each year to 'Pursue Your Passion'
Additional time off for significant work anniversaries (3, 5, 8, 10 years)
Extra leave for life’s big moments like marriage, engagements, or moving house