The Public Sector GenAI T&E Product Manager will be a high-horsepower technical leader, defining the vision and owning the roadmap for our evaluation capabilities. The role requires you to thrive in unscripted, high-stakes environments: you will be the primary owner of the T&E tech stack, the robust infrastructure required to continuously measure, improve, and prove the superiority and sustained performance of our agentic applications.
Job Responsibilities
Defining the vision and owning the roadmap for our evaluation capabilities
Being the primary owner of the T&E tech stack
Identifying bottlenecks, distilling technical friction into actionable plans, and driving execution
Working across Scale’s commercial and public sector teams to define requirements
Refining the tech stack that allows ML teams to hill-climb on evaluation results
Surfacing critical performance information to stakeholders
Requirements
Engineering Depth: 3+ years of experience in software engineering, systems architecture, or highly technical program management
Evaluation Systems Expertise: Proven experience designing, owning the roadmap for, or operating the infrastructure required to continuously measure, improve, and show the performance of AI applications
Problem Distillation: Demonstrated experience taking a vaguely defined problem and delivering a technical roadmap, resource requirements, and measurable success metrics within a narrow time window
Ambiguity Management: Proven track record of taking a project from stalled/undefined to shipped in a high-pressure environment
Cross-Functional Leadership: Led multiple projects that required direct alignment between at least three distinct engineering organizations
Operational Execution: Experience using technical project management frameworks to provide consistent weekly reporting on delivery velocity and blockers to executive stakeholders
Nice to have
Security Clearance: Active Secret, Top Secret, or TS/SCI clearance
GenAI Implementation: Practical experience developing or evaluating features built specifically on LLMs, RAG, or autonomous agent workflows
Technical Rigor: Advanced degree in Computer Science, Engineering, or a related field
Public Sector Expertise: 2+ years of experience working with DoD, IC, or Civil agencies on mission-critical software deployments