We train models on petabyte-scale automotive sensor data, but training is only half the story. Before a single GPU cycle is spent, engineers need to find, filter, evaluate, and understand the data. We build the analytics and search infrastructure that makes petabytes of measurements and recordings queryable in seconds, enabling rapid dataset assembly, quality analysis, and model evaluation at scale.
Job Responsibilities:
Design and build high-performance search and query pipelines over PB-scale MDF4 and MCAP data lakes, enabling ML engineers to find relevant driving scenarios, sensor conditions, and edge cases across billions of records in seconds
Build and operate indexing and cataloguing systems for automotive sensor data, including metadata extraction, signal-level indexing, scene tagging, and embedding-based similarity search
Implement distributed compute pipelines for large-scale data evaluation, such as batch statistics, distribution analysis, annotation coverage reports, and data-quality scoring
Build fast analytical queries that enable interactive exploration on top of raw data
Develop dataset assembly pipelines that automatically assemble, version, and register training and evaluation datasets
Optimise for cost and performance through intelligent partitioning, tiered storage, caching strategies, and query pushdown to minimise scan volumes over PB-scale data
Operate observability stacks for data pipelines, including query latency dashboards, pipeline health, and data freshness monitors
Requirements:
University degree in Computer Science, Engineering, or a related field
3–5 years of experience in big data or data engineering with a focus on analytics and search over very large datasets
Strong Python and SQL skills, with experience in at least one distributed compute framework
Experience with columnar or analytical storage and query optimisation at PB scale
Familiarity with search and indexing technologies, including full-text search, vector/embedding search, or metadata catalogues
Production experience with Kubernetes and AWS / Azure / Google Cloud, as well as hands-on experience with infrastructure-as-code
Experience with automotive measurement data (MDF4/ASAM MDF or MCAP) as well as with embedding-based retrieval, dataset management tools, stream processing, or graph-based metadata systems
What we offer:
Challenging projects through which we jointly shape the mobility of tomorrow
Wide range of personal and professional development opportunities
Attractive, fair, and performance-related remuneration
High level of job security
Annual special payments such as vacation pay, Christmas bonus, and profit sharing
Flexible working hours, plus six weeks of annual leave and overtime compensation