We are looking for a skilled and experienced Lead Data Engineer to join our Data Science team. The team ingests large amounts of complex sensor data (billions of data points a day), combines it with data from other teams, and produces advanced modelling products that help people park their car or charge their electric vehicle. For example, we predict the availability of parking in cities across the world and provide drivers with routes that reduce the time they will spend searching for a space near their destination. These machine learning models are high-quality production services and are updated regularly using fresh data. You will lead the design, development, and enhancement of pipelines that ingest and process streaming data for use in our machine learning models. You will be an important member of our team, leading engineering initiatives and working with smart colleagues in a supportive environment.
Job Responsibilities:
Develop pipelines for scalable big data processing with Spark, and real-time data streaming with Kafka
Write pipelines in efficient, testable, and reusable Python code, using libraries such as NumPy, pandas, and PySpark
Manage numerous pipelines using Airflow to meet data serving and modelling requirements
Ensure services are reliable, robust, and follow industry best practice in data validation, transformation, and logging
Be hands-on with infrastructure and cloud deployments
Lead initiatives enhancing our processes and infrastructure (e.g., CI/CD pipelines, data monitoring capabilities, feature stores)
Use your experience to promote best practices among our data scientists and junior engineers
Collaborate with and support team members
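To give a flavour of the day-to-day work, here is a minimal sketch of the kind of testable, reusable pipeline step the role involves, with data validation, transformation, and logging. It uses only the Python standard library; the function and field names (`SensorReading`, `run_pipeline`, the 0-100 value range) are hypothetical illustrations, not part of our actual stack.

```python
import logging
from dataclasses import dataclass

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("pipeline")


@dataclass
class SensorReading:
    """A single (hypothetical) sensor data point."""
    sensor_id: str
    value: float


def validate(readings):
    """Keep readings whose value is in the assumed 0-100 range, logging rejects."""
    valid = []
    for r in readings:
        if 0.0 <= r.value <= 100.0:
            valid.append(r)
        else:
            log.warning("rejected reading from %s: %s", r.sensor_id, r.value)
    return valid


def transform(readings):
    """Normalise values to the 0-1 range."""
    return [SensorReading(r.sensor_id, r.value / 100.0) for r in readings]


def run_pipeline(readings):
    """Compose validation and transformation into one reusable, unit-testable step."""
    return transform(validate(readings))
```

Because each step is a small pure function, it can be unit-tested in isolation and reused across batch (Spark) and streaming (Kafka) pipelines alike.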
Requirements:
Proven experience as a software or data engineer in complex production environments
High proficiency in Python, including software development standards and knowledge of the Python data science/engineering ecosystem (e.g., NumPy, pandas)
Strong command of Linux, containers (Docker), and infrastructure as code for cloud deployments (AWS preferred)
Comfortable leading initiatives and mentoring others
Experience with large-scale data processing in the cloud (we use AWS)
Experience with distributed processing frameworks, such as Apache Spark
Nice to have:
Workflow management tools, such as Apache Airflow
Streaming data platforms, such as Apache Kafka
Data or ML platforms, such as Snowflake or Databricks
What we offer:
Flexible working - hybrid home and office-based opportunities
Paid leave if you participate in a charity event
25 days' holiday entitlement
An enhanced Workplace Pension Scheme - 5% by Arrive, 3% by you
Private medical insurance
Fantastic wellbeing programmes, including weekly on-site sports massages, Reiki, and head massages
Discounted gym membership
Access to Blue Call, a mental health support platform