The integration team is responsible for developing and scaling machine learning algorithms and infrastructure for LLM post-training, with a focus on large-scale, distributed RL methods. We strive for excellence in both engineering and science through meticulously designed experiments and design docs. While tasks are assigned according to each person's expertise, the whole team contributes to writing production code and supporting research efforts, depending on individual interests and organizational needs. In particular, this role aims to raise the overall quality of the post-training codebase by building new tools that ease and support research, optimizing post-training algorithms, and scaling distributed RL to unprecedented levels.
Job Responsibilities:
Design and write high-performing and scalable software for training models
Develop new tools to support and accelerate research and LLM training
Coordinate with other engineering teams (Infrastructure, Efficiency, Serving) and the scientific teams (Agent, Multimodal, Multilingual, etc.) to create a strong and integrated post-training ecosystem
Craft and implement techniques to improve performance and speed up our training cycles across the SFT, offline preference, and RL regimes
Research, implement, and experiment with ideas on our cluster and data infrastructure
Collaborate, collaborate, and collaborate with other scientists, engineers, and teams!
Requirements:
Extremely strong software engineering skills
Value test-driven development and clean code, and strive to reduce technical debt at all levels
Proficiency in Python and related ML frameworks such as JAX, PyTorch, and/or XLA/MLIR
Experience using and debugging large-scale distributed training strategies (memory/speed profiling)
Nice to have:
Experience with distributed training infrastructures (Kubernetes) and associated frameworks (Ray)
Hands-on experience with the post-training phase of model training, with a strong emphasis on scalability and performance
Experience in ML, LLM and RL academic research
What we offer:
An open and inclusive culture and work environment
The chance to work closely with a team at the cutting edge of AI research
Weekly lunch stipend, in-office lunches & snacks
Full health and dental benefits, including a separate budget to take care of your mental health
100% Parental Leave top-up for up to 6 months
Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
Remote-flexible work, with offices in Toronto, New York, San Francisco, London, and Paris, as well as a co-working stipend