As a Data Quality Engineer within Prolific AI Data Services, you will be the quality guardian for our managed service studies. You will design and operationalise the measurement systems, automation, and launch gates that ensure the data we deliver is trustworthy, authentic, and scalable. This role sits at the intersection of data quality, automation, and integrity. You’ll work closely with Product, Engineering, Operations, and Client teams to embed quality and authenticity into study design and execution—enabling faster launches without compromising trust as task types and volumes evolve.
Job Responsibilities:
Own end-to-end quality design for Prolific managed service studies, including rubrics, acceptance criteria, defect taxonomies, severity models, and clear definitions of done
Define, implement, and maintain quality measurement systems, including sampling plans, golden sets, calibration protocols, agreement targets, adjudication workflows, and drift detection
Build and deploy automated quality checks and launch gates using Python and SQL, such as schema and format validation, completeness checks, anomaly detection, consistency testing, and label distribution monitoring (a minimal sketch follows this list)
Design and run launch readiness processes, including pre-launch checks, pilot calibration, ramp criteria, full-launch thresholds, and pause/rollback mechanisms
Partner with Product and Engineering to embed in-study quality controls and authenticity checks into workflows, tooling, and escalation paths
Write and continuously improve guidelines and training materials to keep participants, reviewers, and internal teams aligned on evolving quality standards
Investigate quality and integrity issues end to end, running root-cause analysis across guidelines, UX, screening, training, and operations, and driving corrective and preventive actions (CAPAs)
Build dashboards and operating cadences to track defect rates, rework, throughput versus quality trade-offs, integrity events, and SLA adherence
Lead calibration sessions and coach QA leads and reviewers to improve decision consistency, rubric application, and overall quality judgement
Translate one-off quality fixes into repeatable, scalable playbooks across customers, programs, and study types
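To make the automated-checks responsibility above more concrete, here is a minimal, illustrative sketch of a pre-launch quality gate in Python covering schema validation, a completeness check, and label distribution monitoring. The field names, label set, and thresholds (REQUIRED_FIELDS, ALLOWED_LABELS, EXPECTED_LABEL_SHARE, MAX_SHARE_DEVIATION) are hypothetical placeholders for illustration, not Prolific's actual schema or gating thresholds.

# Illustrative only: a minimal pre-launch quality gate, assuming annotation
# records arrive as dicts and an expected label distribution is known.
# All field names, labels, and thresholds below are hypothetical.
from collections import Counter

REQUIRED_FIELDS = {"participant_id", "item_id", "label"}    # hypothetical schema
ALLOWED_LABELS = {"positive", "negative", "neutral"}        # hypothetical label set
EXPECTED_LABEL_SHARE = {"positive": 0.4, "negative": 0.4, "neutral": 0.2}
MAX_SHARE_DEVIATION = 0.10  # flag labels drifting >10 points from expectation

def check_batch(records):
    """Return a list of human-readable quality issues for one study batch."""
    issues = []

    # Completeness check: an empty batch should never pass the gate.
    if not records:
        return ["batch is empty"]

    # Schema / format validation: every record has the required fields
    # and a label from the allowed set.
    for i, rec in enumerate(records):
        missing = REQUIRED_FIELDS - rec.keys()
        if missing:
            issues.append(f"record {i}: missing fields {sorted(missing)}")
        elif rec["label"] not in ALLOWED_LABELS:
            issues.append(f"record {i}: unexpected label {rec['label']!r}")

    # Label distribution monitoring: compare observed label shares to the
    # expected distribution and flag large deviations.
    counts = Counter(r.get("label") for r in records)
    total = sum(counts.values())
    for label, expected in EXPECTED_LABEL_SHARE.items():
        observed = counts.get(label, 0) / total
        if abs(observed - expected) > MAX_SHARE_DEVIATION:
            issues.append(
                f"label {label!r}: observed share {observed:.2f} "
                f"vs expected {expected:.2f}"
            )
    return issues

if __name__ == "__main__":
    sample = [
        {"participant_id": "p1", "item_id": "i1", "label": "positive"},
        {"participant_id": "p2", "item_id": "i1", "label": "spam"},  # bad label
        {"participant_id": "p3", "item_id": "i2"},                   # missing field
    ]
    for issue in check_batch(sample):
        print(issue)

In practice a gate like this would run before ramp-up and feed the pause/rollback mechanisms described above; the specific checks and thresholds would be tuned per study type.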
Requirements:
5+ years of experience in quality engineering, data or annotation quality, analytics engineering, trust and integrity, or ML/LLM evaluation operations
Strong proficiency in Python and SQL, with comfort applying statistical concepts such as sampling strategies, confidence levels, and agreement metrics (see the illustrative sketch after this list)
A proven track record of turning ambiguous or messy quality problems into clear metrics, automated checks, and durable process improvements
Strong quality systems thinking, with the ability to translate complex edge cases into clear rules, tests, rubrics, and governance mechanisms
Hands-on experience instrumenting workflows and implementing pragmatic automation that catches quality and integrity issues early
Demonstrated ability to influence cross-functional teams (Product, Engineering, Operations, Client teams) and drive change without direct authority
Strong customer empathy, with a clear understanding of what “useful, trustworthy data” means for research, AI training, and evaluation use cases
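As a hedged illustration of the agreement metrics mentioned above, the sketch below computes Cohen's kappa between two reviewers labelling the same items. It is one common chance-corrected agreement metric, not a prescribed one for this role, and the reviewer names and labels are made up.

# Illustrative only: Cohen's kappa as one example of an agreement metric.
# Reviewer names and label values are hypothetical.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two reviewers over the same items."""
    assert len(labels_a) == len(labels_b) and labels_a, "need paired, non-empty labels"
    n = len(labels_a)

    # Observed agreement: share of items where both reviewers chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected chance agreement, from each reviewer's marginal label shares.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(
        (freq_a[label] / n) * (freq_b[label] / n)
        for label in set(labels_a) | set(labels_b)
    )
    if expected == 1.0:  # both reviewers always use the same single label
        return 1.0
    return (observed - expected) / (1 - expected)

if __name__ == "__main__":
    reviewer_1 = ["pass", "pass", "fail", "pass", "fail", "pass"]
    reviewer_2 = ["pass", "fail", "fail", "pass", "fail", "pass"]
    print(f"kappa = {cohens_kappa(reviewer_1, reviewer_2):.2f}")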
Nice to have:
Familiarity with data collection mechanics (screeners, quota/routing constraints, study design patterns)
Experience with LLM evals, red teaming, or policy-based annotation