Job Description
Amazon Devices is an inventive research and development company that designs and engineers high-profile devices like the Kindle family of products, Fire Tablets, Fire TV, Health & Wellness, Amazon Echo, and Astro products. This is an exciting opportunity to bring generative AI to Amazon's consumer products, both on-device at the edge and in the cloud.
Our compression platform delivers 20x to 100x neural network compression, but using it well still takes weeks of hands-on learning and expert intuition. The Edge AI Model Studio team exists to change that. We become the expert users so partner teams don't have to: we turn compression science into reliable production workflows, and we package the results into a library of compression-ready student architectures that partners can run on their own. Our north star is simple. Training-to-deployment should feel like pushing a button, not a month-long science project.
We are looking for an Applied Scientist to join Model Studio and help compress the next generation of models for edge and cloud deployment across modalities, including large language models, vision-language models, speech and audio models, and omni models that reason jointly over text, audio, and video. You will apply and extend state-of-the-art compression recipes to real models, define the benchmarks and evaluation methodology that make trade-offs explicit, and build the reference implementations that let other teams deploy compressed models without our help. You will work backwards from deployment constraints such as memory, latency, throughput, power, and cost, which differ across edge and cloud targets, partnering closely with fellow scientists, platform and compiler engineers, hardware architects, and product teams.
The role sits on two frontiers at once. Compressing a model effectively and healing it back to quality means staying current not just with the latest compression techniques, but also with the rapidly evolving model architectures themselves, and understanding deeply how each one works inside. You will take ownership of project-level delivery, apply advanced compression across a wide range of real models, and have room to grow your scope and technical influence.