Our client is a leading global investment management company headquartered in London. It manages over $228 billion in assets and serves institutional investors, pension funds, wealth managers, and other sophisticated clients worldwide. The firm specializes in quantitative investing, alternative investments, systematic trading strategies, and technology-driven asset management. Data science, machine learning, and AI are core components of its investment and research processes.
As part of our collaboration, we will focus on two foundational capabilities required to enable safe and scalable AI adoption across the enterprise: Agentic Security and AI-Ready Data Foundations.
What project we have for you
We build the data foundations that make AI useful and safe inside regulated financial firms. The value of AI is capped by the data its agents can reach: if an agent cannot find, interpret, or trace the data, or is not correctly permissioned against it, the capability is useless or, worse, unsafe. Your job is to close that gap.
This is a hands-on senior role for an excellent Python engineer with strong data-engineering skills who is genuinely comfortable building with AI agents. You will design and build the catalogue, semantic, entitlement, and analytical layers that turn large on-premise data estates into something agents can use.
Job Responsibilities
Build production-grade Python services and data pipelines over large data stores (columnar / time-series and relational), and the queries that join across them
Select and implement the right query or analytical engine for each workload, rather than defaulting to one
Build catalogue, metadata, lineage and semantic layers that make data discoverable and consistently understood across teams
Implement access control that travels with the data: combine sensitivity and licensing scope, and enforce them at the point of use, including for AI agents
Build agent-facing data access: retrieval (RAG), vector search, and APIs / MCP servers, with permissions applied before context reaches the model
Apply LLMs pragmatically to data work (metadata generation, classification, entity resolution) with humans in the loop, and evaluate the quality of what the agents produce
Help keep data trustworthy: establish golden sources, and apply deduplication and data-quality checks at the source
Contribute to discovery and solutioning: assessing current state, weighing build-vs-adopt, and shaping pragmatic, costed plans
Requirements
6+ years building production software in Python, with strong engineering fundamentals (testing, performance, clean design)
Solid data engineering: SQL, columnar formats (e.g. Parquet), pipeline design, and handling datasets large enough that naive approaches don't scale
Hands-on experience with at least one analytical or query engine (e.g. DuckDB, Trino, Spark, ClickHouse)
Real experience building LLM / agent applications: retrieval (RAG), vector databases, and tool / function calling
A working understanding of data governance: cataloguing, metadata, lineage, and access control (RBAC / ABAC)
An instinct for data quality and trustworthy 'golden' sources