Meta is seeking Research Scientists to join the Evaluations team within Meta Superintelligence Labs (MSL). Evaluations are the core of AI progress at MSL, determining what capabilities get built, which features get prioritized, and how fast our models improve. As a Research Scientist, you will provide the technical expertise to measure and understand the capabilities of our frontier AI systems. You'll work in tandem with world-class researchers to envision, develop, and validate novel evaluations that shape the future of AI capability measurement.
This is a highly technical research role requiring sound scientific judgment, creativity, and the ability to drive ambitious research agendas independently. The evaluations you develop will directly influence research direction and major model lines within MSL, so scientific validity, methodological rigor, and clear communication are all important. You will collaborate closely with technical leadership to ensure evaluations capture the most important capabilities, translating organizational priorities into measurable benchmarks and evaluation insights back into research direction.
We are looking for exceptional research talent: researchers who have shaped the field of machine learning and are ready to do so again at the frontier of AI. If you are passionate about defining how we measure AI progress and want to shape the scientific foundations of frontier AI development, we encourage you to apply for this opportunity at the core of MSL.
Job Responsibilities:
Design novel benchmarks and evaluation methodologies for frontier AI capabilities
Contribute to evaluation frameworks that guide research direction and capability development across MSL
Support the scientific vision for evaluation approaches in emerging modalities and novel model capabilities
Partner with cross-functional research teams across product and model training to identify and prioritize gaps in capability through rigorous evaluation
Work on research workstreams that shape the long-term direction of evaluation science at MSL, operating independently while also contributing to team goals and organizational priorities
Requirements:
Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
Ph.D. in Computer Science, Machine Learning, or a related technical field
3+ years of experience in machine learning research, with a focus on evaluation, deep learning, or related areas
Demonstrated ability to execute on technical research projects from conception to production
Effective communication skills and experience collaborating with technical leadership
Nice to have:
Multiple first-author publications at top-tier peer-reviewed venues (NeurIPS, ICML, ICLR, ACL, EMNLP, or similar) related to language model evaluation, benchmarking, or deep learning
Recognized expertise in machine learning evaluation, benchmarking, or capability measurement
Track record of research that has substantially influenced the field of deep learning
Hands-on experience with language model post-training, RLHF, or related techniques