Inworld is a product-oriented research lab of top AI researchers and engineers, developing best-in-class realtime multimodal models and the only realtime orchestration platform optimized for thousands of queries per second. We’ve raised more than $125M from Lightspeed, Section 32, Kleiner Perkins, Microsoft’s M12 venture fund, Founders Fund, Meta and Stanford, among others. Our technology has powered experiences from companies such as NVIDIA, Microsoft Xbox, Niantic, Logitech Streamlabs, Wishroll, Little Umbrella and Bible Chat. We’ve also been recognized by CB Insights as one of the 100 most promising AI companies globally and have been named one of LinkedIn's Top 10 Startups in the USA.
Requirements:
Inference Optimization. Deep understanding of modern serving frameworks such as vLLM and TensorRT-LLM, and of the techniques behind them
Model Acceleration. Hands-on experience with quantization, distillation, caching strategies, continuous batching, paged attention, and speculative decoding
High-Performance Systems. Proficiency in C++, CUDA, Rust, or highly optimized Python. You know how to profile code and squeeze every ounce of performance out of NVIDIA GPUs
Distributed Systems & Scaling. Experience with Kubernetes, Ray, custom load balancing, multi-GPU/multi-node inference, and reliably handling thousands of concurrent connections
Public Work. Non-trivial systems programming projects, open-source contributions to major inference engines, or deep-dive technical write-ups
Full-Cycle Ownership. You can take a model from the research team, containerize it, optimize its serving, and ensure it runs reliably in production
Background. PhD in CS, Physics, or Math, or equivalent practical experience building backend or ML systems
Professional fluency in English (written and spoken) is required, as you will be collaborating daily with our US-based leadership and engineering teams