We're hiring a Principal AI/ML Engineer to lead the design and implementation of enterprise-grade AI systems that power mission-critical operations. In this role, you will architect scalable AI infrastructure, deliver practical AI solutions that solve real operational challenges, and drive technical strategy for our AI-powered products. You'll apply cutting-edge AI to turn complex military requirements for planning critical operations into robust, production-ready systems. You will have the opportunity to redefine distributed human-AI collaboration at scale using multi-modal foundation models, multi-agent orchestration, and knowledge-graph-driven reasoning. Expect to lead cross-functional technical initiatives, mentor engineers, and establish technical standards that enable rapid innovation while ensuring reliability. The solutions you design must be robust, scalable, and able to support thousands of users in regulated, offline-capable environments.
Job Responsibilities:
Partner with product, domain experts, and leadership to translate operational needs into technical roadmaps
Define system-level standards for model development, evaluation, and deployment, while mentoring senior- and staff-level engineers through collaboration
Drive long-term AI/ML architecture at Onebrief
Agent Orchestration: Design systems that coordinate multiple AI agents, tools, and workflows to accomplish complex operational tasks
Design and implement enterprise-scale AI infrastructure supporting retrieval, generation, and multi-modal reasoning
Build and scale graph-based systems for structured representation, reasoning, and integration with generative models
Apply LLM practices such as retrieval-augmented generation (RAG) and prompt engineering to deliver grounded, reliable outputs for mission planning
Establish evaluation frameworks and SLOs for RAG quality, agent reliability, and system performance in production
Collaborate with infrastructure teams to ensure high-quality data pipelines that power AI applications
Requirements:
M.S. in Computer Science, Engineering, or equivalent practical experience
10+ years of experience with large-scale distributed systems, ideally supporting more than 100k concurrent users
Proven experience designing and deploying AI systems at scale in distributed, production environments
Strong background in integrating LLMs with retrieval systems for real-world use cases
Understanding of data governance, model safety evaluations, red-teaming, and secure ML practices for regulated domains
Experience building systems that use information retrieval (relevance and ranking) and natural language processing techniques such as Named Entity Recognition (NER)
Nice to have:
Background in defense, national security, or other mission-critical domains
Experience orchestrating systems that combine text, structured data, and domain-specific signals
What we offer:
Equity: Share in the company's success
Flexible Work Environment: Remote-first organization* with flexible work hours and unlimited PTO
Comprehensive Health Coverage: Health, dental, vision, and life insurance
Retirement Plan: 401(k) plan with company match to secure your future
Parental Leave: 8 weeks at 100% pay, regardless of state
Company Retreats: Annual company summit trips
Home Office Budget: $1,000 per year for home office improvements