As a Multimodal Speech Engineer on the AI Companion Team, you will lead the development of a real-time conversational speech model that integrates multiple modalities including vision, spatial audio, and body language. You will collaborate with cross-functional teams to align NEO’s speech with its physical embodiment and personality. This is a key role in shaping how users interact with our humanoid robot in intuitive, engaging ways.
Job Responsibilities:
Design and implement data pipelines for large-scale speech interactions using internal and external datasets
Train speech-to-speech models that incorporate awareness of NEO’s physical form
Create dynamic responses for a wide range of user queries
Synchronize NEO’s speech with physical gestures and body language
Customize NEO’s speech behavior to reflect different personalities
Requirements:
3+ years of experience in speech and audio modeling domains
Experience with multimodal conversational models (language, audio, vision)
Ability to take open-ended problems in conversation modeling, develop creative solutions, build proof-of-concepts, and scale them to production