We are seeking a highly capable Senior Researcher to push the boundaries of streaming speech recognition and speech understanding. This role sits at the intersection of cutting-edge research and production impact—you'll develop novel architectures and algorithms that ship to millions of users. This is a unique opportunity to shape the next generation of Speech AI at a company experiencing rapid growth in one of the most dynamic fields in AI. You'll join the team developing Universal-Streaming—our production streaming ASR system—while exploring the frontier of LLM-based contextualization and real-time speech understanding.
Job Responsibilities:
Design and develop novel streaming ASR architectures, pushing the boundaries of accuracy-latency tradeoffs in production systems
Research and prototype LLM-assisted speech-to-text, exploring how large language models can enhance streaming speech recognition and understanding
Develop new algorithms for streaming speaker diarization, contextual biasing, and multilingual speech recognition
Drive research from initial experimentation through rigorous evaluation to production deployment, working closely with engineering teams
Conduct systematic experiments on internal and public benchmarks, with careful attention to evaluation methodology and statistical rigor
Contribute to technical publications and represent AssemblyAI's research at top venues
Collaborate on research direction and technical strategy, helping shape the roadmap for Speech AI capabilities
Requirements:
Strong research background in speech recognition, with deep understanding of classic streaming architectures (RNN-T, CTC) and modern attention mechanisms
Expertise in LLMs and language modeling, with ability to bridge speech and language model research
Proficiency in PyTorch and JAX/Flax—you can move fluidly between frameworks and implement complex architectures from scratch
Experience with large-scale distributed training and the practical challenges of scaling speech models
Track record of publications at top venues (ICASSP, Interspeech, NeurIPS, ICML) or equivalent industry impact
Strong experimental methodology—systematic approach to ablations, careful baseline comparisons, and rigorous evaluation
Ability to translate research insights into production-ready solutions, and comfort working at the interface of research and engineering
Excellent communication skills—can articulate complex technical ideas clearly and collaborate effectively across teams
What we offer:
Competitive equity grants
100% employer-paid benefits
Flexibility of being fully remote
401(k) match up to 4% for US-based full-time team members