Applied AI at Uber builds intelligent systems that power next-generation product experiences for riders, drivers, merchants, and couriers. As a Staff Voice AI Engineer, you will lead the design and deployment of large-scale, real-time Voice AI systems that enable natural, reliable, and intelligent voice interactions across Uber’s ecosystem. You will operate as a full-stack technical leader across speech modeling, LLM-powered conversational intelligence, and low-latency backend infrastructure — owning Voice AI systems end-to-end, from model development and evaluation to highly available, distributed production services. This includes advancing capabilities in automatic speech recognition (ASR), text-to-speech (TTS), spoken language understanding, and LLM-driven dialogue systems. You will partner closely with product, design, and infrastructure teams to translate customer pain points into seamless voice-first experiences — setting the foundation for how Voice AI is built, deployed, and operated across Uber’s global platform.
Job Responsibilities:
Design and build end-to-end Voice AI solutions, from understanding customer pain points and defining product requirements to deploying LLM-powered, real-time voice interfaces in production
Benchmark and evaluate voice AI systems, including speech recognition, speech synthesis, and spoken language understanding, by designing evaluations, analyzing results, and identifying systematic weaknesses
Improve voice model performance through system prompt tuning, fine-tuning voice- and speech-specific models, and optimizing architectures for low-latency, real-time voice interactions
Analyze voice request logs, prompt traces, and audio inputs to diagnose failure modes and improve transcription accuracy, conversational quality, and overall user experience
Build and maintain internal tools and platforms to automate Voice AI workflows, such as large-scale transcription pipelines, real-time audio processing services, and evaluation harnesses for voice quality
Own Voice AI systems in production end-to-end, including rollout strategies, monitoring, alerting, quality regression detection, and on-call readiness
Collaborate closely with product, design, and research teams to translate user needs into Voice AI capabilities with measurable business and customer impact
Requirements:
10+ years of experience in software engineering, data science, or machine learning, including a track record of shipping production AI systems
Deep understanding of large language models, including fine-tuning, prompt engineering, embeddings, and retrieval-augmented generation (RAG)
Strong backend and distributed systems expertise, with experience designing and operating highly available, scalable services in production
Deep experience with ML infrastructure, including model training pipelines, online serving systems, feature stores, experiment platforms, and evaluation frameworks
Hands-on experience with distributed data processing systems (e.g., Spark, Flink, Ray) and workflow orchestration (e.g., Airflow or equivalent)
Ability to analyze data, run experiments, and derive insights for model and product improvement
Excellent communication and collaboration skills across technical and non-technical teams
Nice to have:
Experience building evaluation frameworks for Voice AI, including metrics and human/LLM-assisted evaluations for speech recognition accuracy, latency, robustness, and naturalness of synthesized speech
Demonstrated expertise in machine learning fundamentals applied to voice, including model evaluation, training, and fine-tuning of ASR, TTS, or speech-language models
Proven experience deploying Voice AI systems to production, with an emphasis on low-latency, high-reliability, real-time environments
Experience writing developer documentation, creating voice-specific SDKs, or enabling internal teams to build on shared Voice AI platforms
Hands-on work with large-scale audio datasets, including data curation, labeling strategies, and optimization of voice processing pipelines at scale
What we offer:
Eligible to participate in Uber's bonus program
May be offered an equity award and other types of compensation