We’re hiring a founding-level ML systems engineer to work full-time and in person in San Francisco (Dogpatch). You will report directly to the cofounders. Your work will directly make the game more affordable and more fun.
Job Responsibilities:
Build and run the infrastructure needed to rigorously tailor harnesses and prompts to each AI model individually to squeeze out maximum performance
Train domain-specific models to close or even eliminate the gap between open and closed models in their weight class at playing Pax
Reduce costs associated with closed-source models by optimizing caching strategies
Further improve performance of closed-source models by training tuned endpoints
Evaluate and improve embedding and reranker performance in the places we use them
Enable entirely new user experiences based on upcoming world models
Requirements:
You have shipped ML systems to real users and operated them in production
You have made explicit cost/quality tradeoffs in deployed systems
You have debugged and fixed unexpected model failures in production (e.g., expert hot-spots or structured output errors)
You have designed, critiqued, or iterated on evaluation frameworks and understand their failure modes
You bias toward leverage and compounding improvements (better evals, better feedback loops, better infrastructure)
You are willing to work on the “boring” but important problems like instrumentation, data hygiene, debugging, and reliability
You take ownership of problems and are comfortable advocating for your ideas (while remaining open to evidence)
You know when to say “no” to yourself and us when something isn’t worth the complexity or risk
Visa: US citizen/visa only
Nice to have:
Experience with preference modeling, pairwise ranking, or human-in-the-loop evaluation systems
Background in games, simulations, storytelling systems, or other domains where qualitative judgment matters
Experience operating systems at high request volume
Prior work at an early-stage startup or as a founding engineer
What we offer:
0.25% to 1%+ equity
Vesting schedule: 4-year monthly vesting with a 12-month cliff