As a Member of Technical Staff on the pretraining evals team, you will play a key role in helping us make modelling decisions based on experimental outcomes for our large language models (LLMs). Your primary focus will be on developing better ways to measure base model progress. This can include implementing new or improved evaluations of base model capabilities, finding ways to reduce noise in our current model evaluations, or developing evaluation benchmarks that measure progress at all model scales, among other directions.
Job Responsibilities:
Deeply understand each evaluation task in our base model evaluation suite: have a clear idea of what each task measures, and know its strengths and limitations
Suggest and implement improvements to our base model evaluation suite, whether by adding new tasks to measure unmeasured model capabilities or removing redundant or low-signal tasks
Deepen our statistical understanding of our evals and improve the signal-to-noise ratio of our evaluation suite
Requirements:
Familiarity with base model evaluations and how they differ from evaluations of post-trained models
Strong statistical skills and experience evaluating scientific experiments related to data collection and model performance
Ability to convey statistical information effectively to a broad audience using visualizations and easy-to-understand numbers
Extremely strong software engineering skills
Proficiency in programming languages such as Python and in ML frameworks (e.g., PyTorch, TensorFlow, JAX)
Excellent communication skills to collaborate effectively with cross-functional teams and present findings
One or more papers at top-tier venues (such as NeurIPS, ICML, ICLR, AISTATS, MLSys, JMLR, AAAI, Nature, COLING, ACL, EMNLP)
What we offer:
An open and inclusive culture and work environment
Work closely with a team on the cutting edge of AI research
Weekly lunch stipend, in-office lunches & snacks
Full health and dental benefits, including a separate budget to take care of your mental health
100% Parental Leave top-up for up to 6 months
Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
Remote-flexible work, with offices in Toronto, New York, San Francisco, London, and Paris, as well as a co-working stipend