Contra Labs is a human-centered AI lab focused on creative and multimodal outputs, where human taste defines the next generation of AI capabilities. We build industry-leading creative preference datasets that power benchmarking, evaluation, and post-training for the world's leading AI models and applications. Our work helps define what 'good' looks like across design, video, imagery, and beyond. Built on top of Contra, the professional network for independent creatives, Contra Labs connects frontier AI labs with a global network of top creative talent. Together, we turn human judgment into the training and evaluation layer that enables AI models and tools to power the next generation of human creativity.
Job Responsibilities:
Own client engagements end-to-end: scoping, project planning, evaluator team buildout, QA, and final delivery
Design annotation protocols, rubrics, and evaluation methodologies for creative and multimodal outputs
Build and manage evaluator teams from Contra's creative network, ensuring consistent quality through calibration and real-time course-correction
Build repeatable playbooks and partner with GTM on scoping and pricing new deals
Collaborate with engineering on tooling needs and contribute to Contra Labs' positioning through case studies and benchmark publications
Requirements:
Based in NYC, USA
In-office 5 days/week (Williamsburg)
Nice to have:
Familiarity with statistical methods (inter-rater reliability, sampling, regression analysis) and data analysis tools (Python, SQL)
Experience with RLHF, LLM evaluation, benchmark design, or human-in-the-loop systems
Background at a data labeling company (Scale, Surge, Appen, Invisible, etc.)
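For candidates curious about the statistical methods mentioned above, here is a minimal sketch of one common inter-rater reliability measure, Cohen's kappa, which corrects raw agreement between two annotators for chance. The labels and rater data are hypothetical and purely illustrative.

```python
# Illustrative sketch: Cohen's kappa for two raters (hypothetical data).
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: fraction of items both raters labeled identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected agreement if the raters labeled independently at their
    # observed label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

a = ["good", "good", "bad", "good", "bad", "bad"]
b = ["good", "bad", "bad", "good", "bad", "good"]
print(round(cohens_kappa(a, b), 3))  # → 0.333
```

Values near 1 indicate strong agreement beyond chance; values near 0 indicate agreement no better than chance, which is why kappa is preferred over raw percent agreement when calibrating evaluator teams.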