We are seeking a Principal Engineer to lead the design, evolution, and reliability of our global financial data extraction and normalization platform. This system ingests filings across formats (HTML, PDF, APIs, spreadsheets), extracts and standardizes financial data using a combination of AI and deterministic systems, and delivers trusted, traceable outputs at scale to downstream financial services teams and customers.
This role owns the self-sourcing and extraction foundation used by multiple teams across the organization. Today, human validation is used as a safety net. Your mission is to systematically reduce human dependency by increasing system correctness, confidence, and trust, without sacrificing speed or scalability.
You will operate at the intersection of distributed systems, AI-driven extraction, data quality, and platform architecture, raising the technical bar and reshaping how we build reliable systems.
Job Responsibilities:
Own the architecture and evolution of our large-scale data extraction and normalization platform
Design systems that process hundreds of thousands of records across heterogeneous sources (PDF, HTML, APIs, XLS) with high reliability
Define and implement strategies to minimize human-in-the-loop validation through AI-based validation, confidence scoring, provenance tracking, and deterministic safeguards
Establish clear system contracts for correctness, traceability, and confidence (e.g., “where did this number come from?”)
Balance AI-driven approaches with procedural and rules-based systems where they improve reliability and explainability
Identify and remediate architectural and operational bottlenecks impacting scale, accuracy, and developer velocity
Act as the technical authority to block poor designs, redesign critical systems, and introduce new platforms or tooling when necessary
Partner with product and downstream consumers to define quality bars, SLAs, and success metrics
Serve as a technical escalation point for production issues, reliability failures, and systemic risks
Mentor senior engineers, shape technical culture, and raise expectations for system design and execution
Contribute to long-term technical strategy while remaining hands-on with critical implementations
Requirements:
15+ years of experience building and operating large-scale production systems
Proven experience designing data ingestion, extraction, or processing systems at scale
Deep expertise in one or more of: distributed systems, data platforms and pipelines, AI/ML-powered extraction or classification systems, or platform and infrastructure engineering
Demonstrated ability to design trustworthy systems that combine probabilistic (AI) and deterministic approaches
Strong understanding of system reliability, observability, failure modes, and iterative hardening
Experience reducing operational or human overhead through better system design
Track record of technical leadership across teams without formal people management
Comfortable operating in ambiguity and driving clarity where none exists
Passion for using AI responsibly and effectively to deliver scalable, high-confidence systems
Nice to have:
Experience with document understanding, NLP, or financial data extraction
Experience building provenance, lineage, or confidence-scoring systems
Familiarity with cloud-native architectures and modern data stacks
Experience shaping or owning internal platforms used by multiple teams