Vision understanding is a critical addition to conversational AI, bridging the gap between speech and the physical world. We’re looking for a skilled engineer or researcher to build high-value synthetic data pipelines that accelerate vision model development. The ideal candidate will be fluent in classical computer vision techniques while also being comfortable leveraging modern machine learning tools across the stack: from neural rendering and diffusion-based image synthesis to transfer learning, domain adaptation, and data-centric evaluation. You’ll collaborate with research, hardware, and product teams to build capture, generation, and rendering systems that combine physical accuracy with visual realism, delivering datasets and simulators that measurably improve downstream computer vision tasks.
Job Responsibilities:
Build and maintain synthetic data generation pipelines (e.g., neural rendering, diffusion/score-based models, controllable generative priors, procedural assets) with levers for pose, expression, illumination, materials, and sensor characteristics
Apply transfer learning and domain adaptation (self-supervised pretraining, style/appearance transfer, sim-to-real) to bridge distribution gaps between synthetic and real data
Integrate off-the-shelf and open-source components where practical; fine-tune or distill models to meet latency, memory, and quality targets on target hardware
Stand up end-to-end systems—from capture and calibration to generation, data curation, quality gates, rendering/evaluation suites, and deployment
Define dataset and model evaluation frameworks (coverage, bias, sim-to-real gap, task-level KPIs such as gaze error) and iterate based on quantitative results
Survey literature across graphics, vision, and generative ML; prototype, adapt, and, where needed, invent new approaches that push facial reconstruction, appearance modeling, and synthetic data quality forward
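To make one of the evaluation responsibilities above concrete, here is a minimal sketch of the kind of task-level KPI the posting mentions: angular gaze error between a predicted and a ground-truth 3D gaze direction. The function name and interface are illustrative assumptions, not part of any specific codebase:

```python
import math

def angular_gaze_error_deg(pred, gt):
    """Angular error in degrees between two 3D gaze direction vectors.

    Illustrative helper: assumes both vectors are non-zero; normalization
    is handled internally so inputs need not be unit length.
    """
    dot = sum(p * g for p, g in zip(pred, gt))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_g = math.sqrt(sum(g * g for g in gt))
    # Clamp the cosine to [-1, 1] to guard against floating-point drift
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_g)))
    return math.degrees(math.acos(cos_angle))
```

In a dataset-evaluation loop, this per-sample error would typically be aggregated (mean or percentile) across a held-out real-world test set to quantify how much a synthetic-data recipe improves the downstream model.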
Requirements:
Demonstrated experience with 3D reconstruction, photorealistic rendering, appearance modeling, or synthetic data generation for vision tasks
Ability to navigate and deliver results in high-ambiguity, open-ended problem spaces
Familiarity with large-scale, multi-camera datasets and the practicalities of curation, annotation, and evaluation
Excellent communication skills and the ability to work collaboratively across disciplines
Bachelor’s degree or higher in computer graphics, vision, imaging, machine learning, or a related field
Nice to have:
Master’s or Ph.D. in a relevant discipline
Hands-on experience training or adapting neural rendering models (e.g., NeRF/3DGS variants, relighting, inverse rendering) and modern generative models (e.g., diffusion/latent diffusion, controllable text-to-image/video, inpainting/outpainting)
Proficiency in PyTorch, JAX, or other modern ML frameworks
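As a toy illustration of the style/appearance-transfer idea referenced in the responsibilities (one common building block is adaptive instance normalization, AdaIN), here is a hedged 1-D sketch that re-scales "content" feature statistics to match a "style" distribution. Real implementations operate per-channel on feature maps inside a network; this stdlib version only shows the statistic-matching step, and the function name is an assumption:

```python
import statistics

def adain_1d(content, style):
    """Toy 1-D adaptive instance normalization (AdaIN).

    Shifts and scales the content features so their mean and standard
    deviation match those of the style features. Assumes the content
    features are not all identical (non-zero standard deviation).
    """
    c_mean, c_std = statistics.fmean(content), statistics.pstdev(content)
    s_mean, s_std = statistics.fmean(style), statistics.pstdev(style)
    # Normalize content to zero mean / unit std, then re-apply style stats
    return [(x - c_mean) / c_std * s_std + s_mean for x in content]
```

In a sim-to-real setting, this kind of statistic matching is one simple lever for narrowing the appearance gap between rendered and captured imagery before or during training.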
What we offer:
401k matching
100% employer-paid health, vision, and dental benefits