Prolific is not just another player in the AI space – we are the architects of the human data infrastructure that's reshaping the landscape of AI development. In a world where foundational AI technologies are increasingly commoditized, it's the quality and diversity of human-generated data that truly differentiates products and models. The future of AI development relies on a critical, indispensable component: high-quality human data. Prolific provides the world's largest and most trusted source of this data to the teams pushing the boundaries of AI technology.

As a Senior ML/LLMOps Engineer, you will be the backbone of our AI production lifecycle. You will bridge the gap between research and real-world application, ensuring our Data Scientists and AI Researchers have the high-performance infrastructure, automated pipelines, and deployment strategies needed to ship state-of-the-art models at scale. We deploy models and infrastructure responsible for a host of AI tasks, ranging from fraud detection to RAG-based search.
Job Responsibilities:
Infrastructure & Platform Engineering:
Design and maintain scalable cloud environments (GCP/AWS) using Terraform
Manage GPU/TPU resource allocation for training, fine-tuning, and interactive notebooks
Build internal services and CLI tools to streamline the developer experience for the AI team
ML & LLM Orchestration:
Design CI/CD/CT (Continuous Training) pipelines using tools such as GitHub Actions, MLFlow, Vertex AI Pipelines
Develop reusable patterns for model serving
Manage service deployments to Kubernetes
Manage and optimize vector databases and embedding pipelines for RAG-based systems
Performance & Optimization:
Implement techniques to reduce latency and increase throughput
Solve scaling bottlenecks for serverless or containerized model deployments
Optimize GPU utilization and cloud spend without compromising performance
Observability & Reliability:
Monitor for model drift, data skew, and resource utilization (see the sketch after this list)
Implement LLM tracing to monitor prompts, agent actions, and general service health
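For a flavour of the work, here is a minimal, purely illustrative Python sketch of a continuous-training drift check that records its result with MLFlow; the experiment name, metric names, and drift threshold are hypothetical, and a real pipeline would source its reference and live samples from production data stores.

```python
import mlflow
from scipy.stats import ks_2samp

# Illustrative names and threshold only; a real pipeline would pull
# reference and live feature samples from production data stores.
EXPERIMENT = "fraud-detection-ct"   # hypothetical experiment name
DRIFT_P_VALUE = 0.01                # hypothetical significance threshold


def check_drift_and_log(reference_scores, live_scores):
    """Run a two-sample KS drift test and record it as an MLflow run."""
    mlflow.set_experiment(EXPERIMENT)
    with mlflow.start_run(run_name="drift-check"):
        stat, p_value = ks_2samp(reference_scores, live_scores)
        mlflow.log_metric("ks_statistic", stat)
        mlflow.log_metric("ks_p_value", p_value)
        drifted = bool(p_value < DRIFT_P_VALUE)
        mlflow.log_param("retrain_triggered", drifted)
        return drifted
```

In practice, a step like this would sit inside a scheduled job (for example in GitHub Actions or Vertex AI Pipelines) and gate whether a retraining run is kicked off.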
Requirements:
5+ years of experience with cloud infrastructure and infrastructure as code
Previous experience with the ML and LLM lifecycle - training, hosting, optimisation, observability
Comfortable working closely with researchers and data scientists, taking experiments from worksheets into production
Strong grasp of ML fundamentals and the modern GenAI stack