As our Data Engineer, you will design, build, and maintain the data infrastructure that powers Sensmore's embodied AI and Vision-Language-Action Models (VLAMs). You'll collaborate with Robotics, ML, and Software engineers to ensure clean, reliable data flows from our sensor arrays (radar, LiDAR, cameras, IMUs) into training and inference pipelines. This role blends classic data engineering (ETL/ELT, warehouse design, monitoring) with ML Ops best practices: model versioning, data drift detection, and automated retraining.
Job Responsibilities:
Build & operate data pipelines: Ingest, process, and transform multi-sensor telemetry (radar point-clouds, video frames, log streams) into analytics-ready and ML-ready formats
Design scalable storage: Architect high-throughput, low-latency data lakes and warehouses (e.g., S3, Delta Lake, Redshift/Snowflake)
Enable ML Ops workflows: Integrate DVC or MLflow, automate model training/retraining triggers, track data/model lineage
Ensure data quality: Implement validation, monitoring, and alerting to catch anomalies and schema changes early
Collaborate cross-functionally: Partner with Embedded Systems, Robotics, and Software teams to align on data schemas, APIs, and real-time requirements
Optimize performance: Tune distributed processing, queries, and storage layouts for cost-efficiency and throughput
Document & evangelize: Maintain clear documentation for data schemas, pipeline architectures, and ML Ops practices to upskill the whole team
Requirements:
3+ years of hands-on experience building production data pipelines in the cloud (AWS, GCP, or Azure)
Proficiency in Python, SQL, and at least one big-data framework (e.g., Spark or Flink)
Familiarity with ML Ops tooling: DVC, MLflow, Kubeflow, or similar
Experience designing and operating data warehouses/data lakes (e.g., Redshift, Snowflake, BigQuery, Delta Lake)
Strong understanding of distributed systems, data serialization (Parquet, Avro), and batch vs. streaming paradigms
Excellent problem-solving skills and the ability to work in ambiguous, fast-paced environments
Nice to have:
Background in robotics or sensor data (radar, LiDAR, camera pipelines)
Knowledge of real-time data processing and edge-computing constraints
Experience with infrastructure as code (Terraform, CloudFormation) and CI/CD for data workflows
Familiarity with Kubernetes and containerized deployments
Exposure to vision-language or action-planning ML models
What we offer:
Attractive compensation package and stock options
On-site beverages and regular social events
Engage with top-tier researchers, engineers, and thought leaders
Influence the future of robotics and tackle significant technical challenges