Build out our next-generation Agent RL training platform, which will train best-in-class agents that achieve state-of-the-art results on real enterprise use cases. Integrate cutting-edge research into our training stack, enabling MLREs on the Enterprise AI team to deploy use cases ranging from next-generation AI cybersecurity firewall LLMs to foundation healthtech search models.
Job Responsibilities:
Train state-of-the-art models, both internally developed and from the community, for deployment to our enterprise customers
Research cutting-edge algorithms to integrate directly into our training stack
Design solutions that enable complex multi-agent systems to learn directly from both process-based and outcome-based rewards
Requirements:
5+ years of experience training LLMs in a production environment
Experience with post-training methods such as RLHF and RLVR, and related algorithms such as PPO and GRPO
Publications in top conferences such as NeurIPS, ICLR, or ICML within the last two years
PhD or Master's degree in Computer Science or a related field