Amazon Quick Suite is an enterprise AI platform that transforms how organizations work with their data and knowledge. Combining generative AI-powered search, deep research capabilities, intelligent agents and automations, and comprehensive business intelligence, Quick Suite serves tens of thousands of users. Our platform processes thousands of queries monthly, helping teams make faster, data-driven decisions while maintaining enterprise-grade security and governance. From natural language interactions with complex datasets to automated workflows and custom AI agents, Quick Suite is redefining workplace productivity at unprecedented scale.
We are seeking a Data Scientist II to join our Quick Data team, focusing on evaluation and benchmarking data development for Quick Suite features, with particular emphasis on Research and other generative AI capabilities. Our mission is to engineer high-quality datasets that are essential to the success of Amazon Quick Suite. From human evaluations and Responsible AI safeguards to Retrieval-Augmented Generation and beyond, our work ensures that Generative AI is enterprise-ready, safe, and effective for users at scale.
As part of our diverse team (including data scientists, engineers, language engineers, linguists, and program managers), you will collaborate closely with science, engineering, and product teams. We are driven by customer obsession and a commitment to excellence.
Job Responsibilities:
Design and develop comprehensive evaluation and benchmarking datasets for Quick Suite AI-powered features
Leverage LLMs for synthetic data corpora generation, data evaluation, and quality assessment using LLM-as-a-judge settings
Create ground truth datasets with high-quality question-answer pairs across diverse domains and use cases
Lead human annotation initiatives and model evaluation audits to ensure data quality and relevance
Develop and refine annotation guidelines and quality frameworks for evaluation tasks
Conduct statistical analysis to measure model performance, identify failure patterns, and guide improvement strategies
Collaborate with ML scientists and engineers to translate evaluation insights into actionable product improvements
Build scalable data pipelines and tools to support continuous evaluation and benchmarking efforts
Contribute to Responsible AI initiatives by developing safety and fairness evaluation datasets
Requirements:
2+ years of experience as a data scientist
3+ years of experience with data querying languages (e.g. SQL), scripting languages (e.g. Python), or statistical/mathematical software (e.g. R, SAS, Matlab)
3+ years of experience with machine learning and statistical modeling tools and techniques for data analysis, and with the parameters that affect their performance
1+ years of experience working with or evaluating AI systems
1+ years of experience creating or contributing to mathematical textbooks, research papers, or educational content
Master's degree in Science, Technology, Engineering, or Mathematics (STEM), or experience working in Science, Technology, Engineering, or Mathematics (STEM)
Experience applying theoretical models in an applied environment
Nice to have:
Ph.D. in Science, Technology, Engineering, or Mathematics (STEM)
Knowledge of machine learning concepts and their application to reasoning and problem-solving
Experience in an ML or data scientist role with a large technology company
Experience in defining and creating benchmarks for assessing GenAI model performance
Experience working on multi-team, cross-disciplinary projects
Experience applying quantitative analysis to solve business problems and making data-driven business decisions
Experience effectively communicating complex concepts through written and verbal communication
What we offer:
Health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance with an option for Supplemental Life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage)