As a Member of Technical Staff in Data Analysis and Evaluation, you will play a pivotal role in ensuring the quality, reliability, and performance of our large language models (LLMs). Your primary focus will be on designing and conducting data collection tasks, assessing and evaluating dataset quality, and analysing the robustness and generalisability of our models. You will work closely with cross-functional teams, including researchers, engineers, and data annotators, to drive data-driven decision-making and improve the overall effectiveness of our AI systems. This role combines expertise in statistics, experimental design (including work with human annotators), and machine learning to ensure that our models are trained on high-quality data and perform reliably across diverse scenarios. You will contribute to Cohere’s mission of advancing AI by ensuring our systems are robust, scalable, and impactful.
Job Responsibilities:
Design and oversee data collection tasks, including supporting human annotators and ensuring data quality
Develop and apply statistical methods to evaluate the quality and reliability of datasets
Analyse and assess the generalisability and robustness of ML systems across diverse use cases
Collaborate with teams to improve dataset quality and model performance
Train and fine-tune large language models (LLMs) on distributed training infrastructures
Conduct experiments to evaluate model performance and identify areas for improvement
Requirements:
Extremely strong software engineering skills
Strong expertise in designing and conducting data collection tasks, including working with human annotators
Strong statistical skills and experience evaluating scientific experiments related to data collection and model performance
Experience analysing datasets with respect to their quality, biases, and suitability for training ML models
Hands-on experience training large language models (LLMs) on distributed training infrastructures
Familiarity with evaluating and improving the generalisability and robustness of ML systems
Proficiency in programming languages such as Python and ML frameworks (e.g., PyTorch, TensorFlow, JAX)
Excellent communication skills to collaborate effectively with cross-functional teams and present findings
One or more papers at top-tier venues (such as NeurIPS, ICML, ICLR, AIStats, MLSys, JMLR, AAAI, Nature, COLING, ACL, EMNLP)
What we offer:
An open and inclusive culture and work environment
Work closely with a team on the cutting edge of AI research
Weekly lunch stipend, in-office lunches & snacks
Full health and dental benefits, including a separate budget to take care of your mental health
100% Parental Leave top-up for up to 6 months
Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
Remote-flexible, with offices in Toronto, New York, San Francisco, London, and Paris, as well as a co-working stipend