Harvey is building the AI platform for the world’s top legal and professional services teams. Our users rely on fast, accurate access to external legal data to perform research that underpins their most important work. As we scale, our Data & Knowledge organization sits at the center of this mission, turning raw, fragmented information into intelligent systems that power research and reasoning at global scale.
Our Data Team is responsible for ingesting, structuring, understanding, and retrieving millions of documents across jurisdictions, formats, and domains. Our mission is to organize the world’s legal information, whether public or private, offline or online, and make it accessible, reliable, and useful.
We’re looking for a Product Manager with deep search, retrieval, and data platform expertise to lead the next generation of Harvey’s data engine as we 100× our capabilities. You will shape the strategy, roadmap, and architecture behind the systems that make advanced reasoning possible. The team owns end-to-end RAG (retrieval-augmented generation) pipelines for domains such as case law, legislation, tax code, and IP law in 50+ jurisdictions. As generation quality continues to improve, retrieval quality has become the new frontier and one of Harvey’s highest priorities.
Job Responsibilities:
Drive the roadmap and strategy for Harvey’s “Data Factory,” ensuring we scale our data 100× through new platforms that build the “legal index” of the world
Work with internal operations and external data providers to methodically expand coverage, accelerate execution, and improve dataset quality
Own and evolve Harvey’s end-to-end data architecture—from ingestion and transformation to storage, indexing, and retrieval—ensuring performance, reliability, and scalability for LLM-powered products
Partner with Applied AI engineers to build and optimize retrieval systems, embeddings, search models, and evaluation frameworks
Architect and oversee large-scale ingestion pipelines that aggregate, normalize, and continuously update millions of heterogeneous legal documents across global jurisdictions
Collaborate cross-functionally with Product Engineering, Applied AI, Research, and Platform teams to deliver high-quality production systems that support reasoning, summarization, and legal research workflows
Requirements:
5+ years of experience building or managing search, retrieval, recommendation, or data platforms at scale
Experience working with complex, heterogeneous, or domain-specific datasets that combine structured and unstructured data
Understanding of modern retrieval methods, including hybrid search (lexical + vector), dense retrieval, re-ranking, embeddings, chunking strategies, and index optimization
Hands-on experience with LLMs or RAG frameworks (evaluation, grounding, hybrid pipelines, query rewriting, LLM-as-a-judge, retrieval metrics)
Ability to partner with engineers on technical architecture, with enough depth to challenge assumptions, propose solutions, and influence design
A product mindset for search—balancing user needs, domain complexity, and system constraints to propose high-leverage improvements