We are looking for a highly analytical and strategic thinker to take ownership of our model evaluation analysis and insight generation. Our work has established a high standard for deep-dive analysis of model evaluations. We need someone who can not only maintain this standard but raise it, turning raw result data into a roadmap for model improvement.
Job Responsibilities:
Own the creation of model evaluations end to end, from initial hypothesis and data scraping to final publication
Go beyond aggregate metrics (e.g., "Accuracy is 85%"). Deeply analyze why the model failed on the other 15%. Identify semantic patterns, edge cases, and systemic hallucinations in raw model outputs
Review raw datasets, meeting transcripts, and research notes to identify the "so what?" and turn these findings into a logical hierarchy
Act as the bridge between the data and the narrative, structuring findings so that the most critical "hook" lands first, followed by the supporting evidence
Requirements:
5-10 years of experience in DS, ML, or AI research and analysis
Structured Thinker: You organize your writing logically
High Tolerance for Ambiguity: You can take a messy pile of notes and organize it into a coherent outline without needing your hand held
Executive Presence: You are comfortable interviewing senior leaders and pushing back when an "insight" isn't actually insightful
Cross-Functional Collaboration: You can work across functions, from ML researchers to clients
Nice to have:
Experience in Model Evaluation, ML Engineering or Technical Research
Experience designing or curating datasets (RLHF, SFT data)