The goal of a Staff Machine Learning Engineer at Scale is to lead the design and deployment of agentic AI systems that operate in real-world, mission-critical government environments. On the Public Sector team, you’ll work at the intersection of agentic ML, systems engineering, and applied research, building foundational infrastructure that enables AI systems to reason, plan, and act reliably at national scale. Our Public Sector ML Team partners directly with U.S. defense and intelligence agencies to deploy AI into classified and regulated environments. Through flagship programs like Donovan and Thunderforge, we are advancing the next generation of agentic AI for geospatial reasoning, planning, and decision support. Staff Machine Learning Engineers play a central role in setting technical direction, owning core architectures, and translating ambitious ideas into production systems trusted by government operators.
Job Responsibilities:
Lead the architecture and implementation of agentic AI systems, with a focus on long-horizon reasoning, orchestration, and system-level reliability
Build and scale agents that perform complex geospatial reasoning, including interpreting, generating, and reasoning over maps and spatial data
Design and improve retrieval systems across large collections of static and semi-structured documents, enabling agents to surface high-signal context efficiently
Fine-tune and evaluate embedding models to improve recall and precision for mission-critical datasets
Design memory systems that allow agents to persist state, operate over long contexts, and learn from prior interactions
Own and evolve shared agentic infrastructure and core libraries, enabling reuse across teams, products, and Public Sector contracts
Define evaluation strategies for agentic systems, including robustness testing, failure-mode analysis, and regression testing in production environments
Partner closely with engineering managers, product leaders, and researchers to scope high-impact initiatives and unblock execution across teams
Serve as a technical mentor and multiplier, raising the bar for system design, ML rigor, and production readiness across the organization
Requirements:
8+ years of experience building and deploying applied ML systems in production environments
Deep experience with agentic systems, autonomous workflows, or ML systems that reason and act over multiple steps
Strong background in ML systems engineering, including model serving, pipelines, monitoring, and evaluation
Hands-on experience with retrieval systems, embeddings, or representation learning
Proficiency in Python and modern ML frameworks (e.g., PyTorch), with the ability to design systems end to end
Demonstrated ability to operate at Staff-level scope: setting technical direction, owning ambiguous problems, and driving 0→1 initiatives to production
Experience making thoughtful tradeoffs across performance, cost, reliability, and development velocity
This role requires an active security clearance or the ability to obtain one
What this role offers:
High ownership over 0→1 systems that move directly into production
Real-world constraints that force thoughtful engineering tradeoffs, not just model tuning
Opportunity to shape foundational agentic infrastructure used across multiple teams and missions
Work that blends research depth with applied impact, in environments where correctness, robustness, and trust matter