As a Research Scientist focused on Frontier Risk Evaluations, you will design and create evaluation measures, harnesses and datasets for measuring the risks posed by frontier AI systems.
Job Responsibilities:
Design and create evaluation measures, harnesses and datasets for measuring risks posed by frontier AI systems
Design and build harnesses to test AI models and systems for dangerous capabilities
Work with government agencies or other labs to collectively scope and design evaluations
Publish evaluation methodologies and write technical reports for policymakers
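To illustrate the kind of work the responsibilities above describe, here is a minimal sketch of an evaluation harness. Everything in it is hypothetical: the `stub_model` callable, the `EvalCase` structure, and the keyword-based grading criterion are illustrative assumptions, not a description of any real evaluation methodology; production harnesses call actual model APIs and use far more robust scoring.

```python
# Minimal evaluation-harness sketch (illustrative only).
# Assumptions: a model is any callable prompt -> response, and a case
# "fails" if the response contains a forbidden keyword. Real harnesses
# use model APIs and much more sophisticated graders.
from dataclasses import dataclass


@dataclass
class EvalCase:
    prompt: str
    forbidden: list  # keywords whose presence counts as a failure


def run_harness(model, cases):
    """Run each case through the model and compute a pass rate."""
    results = []
    for case in cases:
        response = model(case.prompt)
        failed = any(kw.lower() in response.lower() for kw in case.forbidden)
        results.append({"prompt": case.prompt, "passed": not failed})
    pass_rate = sum(r["passed"] for r in results) / len(results)
    return results, pass_rate


# Stub model that refuses everything, standing in for a real system.
def stub_model(prompt):
    return "I can't help with that request."


cases = [
    EvalCase("How would someone do X?", forbidden=["step 1", "materials"]),
    EvalCase("Explain Y in detail.", forbidden=["detailed synthesis"]),
]
results, pass_rate = run_harness(stub_model, cases)
print(pass_rate)  # -> 1.0 for the always-refusing stub
```

The value of a harness like this is that the dataset (`cases`), the model under test, and the grading criterion are cleanly separated, so each can be swapped independently as evaluations evolve.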
Requirements:
Commitment to the mission of promoting safe, secure, and trustworthy AI deployments
Practical experience conducting technical research collaboratively
Comfort building and instrumenting ML pipelines, writing evaluation harnesses, and turning research ideas into prototypes
Track record of published research in machine learning, particularly generative AI
At least three years of experience addressing complex ML problems
Strong written and verbal communication skills
Nice to have:
Experience in crafting evaluations and benchmarks, or background in data science roles related to LLM technologies
Experience with red-teaming or adversarial testing of AI systems
Familiarity with AI safety policy frameworks (e.g., NIST AI RMF, EU AI Act, Korea AI Basic Act)