Our current work includes:
Agentic pipelines: multi-step LLM systems with tool use, planning, and self-evaluation that automate complex marketing workflows end to end
Domain-adapted foundation models: fine-tuning open-weight LLMs (LoRA, RLHF, distillation) on proprietary WPP data for tasks like audience segmentation, creative scoring, and brand-safety classification
Retrieval-augmented generation: production RAG systems over large proprietary corpora (embedding models, vector indices, re-ranking) that serve real-time answers to client queries
Classical ML at scale: gradient-boosted models, causal inference pipelines, and recommendation engines that run alongside LLM components in hybrid architectures
You will be the technical lead for projects across these workstreams: scoping the problem, choosing the modelling approach, building the training and evaluation infrastructure, shipping to production, and iterating based on live metrics. You are not handing off a notebook to an engineering team; you ship what you build.
Job Responsibilities:
Design and run training pipelines — data curation, model selection, hyperparameter search, ablation studies — and be accountable for model quality on live traffic
Build and maintain production inference services (latency budgets, batching strategies, quantisation, monitoring) that serve WPP's global client base
Architect agentic AI systems: define tool schemas, orchestration logic, evaluation criteria, and failure modes for multi-step LLM workflows
Work across the stack when needed — write the data pipeline, train the model, build the evaluation harness, deploy the service, and debug it when metrics drift
Set technical direction for your workstream: write design docs, make build-vs-buy decisions, and defend your approach with evidence
Mentor junior scientists and set the quality standards for their work
Requirements:
5+ years shipping ML models to production — you've dealt with data drift, silent failures, retraining cadences, and the gap between offline metrics and business outcomes
Deep, demonstrable expertise in at least one of: NLP/LLMs, computer vision, recommender systems, or causal inference
Hands-on experience with LLM fine-tuning (LoRA, RLHF, DPO) or building LLM-powered systems (agents, RAG, structured generation)