As a Research Engineer, Vision-Language Action (VLA) Models, you will train the NEO robot to perform long‑horizon autonomous tasks in unstructured environments. You will take ownership of autonomous capability development across the full lifecycle: from data review and model design, through deployment and ongoing fleet performance monitoring. Your work will involve combining vision, control, and learning to enable whole‑body manipulation and navigation in novel settings.
Job Responsibilities:
Take extreme ownership over autonomous capabilities: reviewing data, designing model architectures, shipping models, and maintaining performance across the fleet
Train NEO for whole‑body manipulation and navigation tasks in unseen environments
Design robust evaluation metrics to support scaling of model pre‑training
Experiment with state‑of‑the‑art techniques from the vision–language model and generative modeling literature to predict actions
Collaborate with controls, QA, and data collection teams to deploy reinforcement learning policies to the production fleet
Requirements:
Strong programming experience in Python (and familiarity with tools like Bazel)
Experience with frameworks like PyTorch
Experience with simulation environments (e.g., Isaac Sim, MuJoCo)
Deep understanding of how autonomous systems generalize to new, unseen environments
Experience designing evaluation metrics and validating models in real or simulated settings
Ability to coordinate with cross‑functional teams (controls, QA, data) to bring models into production