The Model Shaping team at Together AI works on products and research for tailoring open foundation models to downstream applications. We build services that let machine learning developers choose the best models for their tasks and further improve those models using domain-specific data. We also develop new methods for more efficient model training and evaluation, drawing on ideas from across machine learning, natural language processing, and ML systems.
As a Research Intern on the Model Shaping team, you will work in one or more of the following areas:
Advanced post-training methods spanning supervised learning, preference optimization, and reinforcement learning
New techniques and systems for efficient training of neural networks (e.g., distributed training, algorithmic improvements, optimization methods)
Robust and reliable evaluation of foundation model capabilities
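To give a flavor of the preference-optimization work named above, here is a minimal, illustrative PyTorch sketch of a DPO-style loss (Rafailov et al., 2023). It is a generic example, not code from Together's stack; the function name and arguments are hypothetical.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Direct Preference Optimization loss over a batch of preference pairs.

    Each argument is a tensor of per-sequence log-probabilities (summed over
    tokens) under the policy or the frozen reference model; `beta` controls
    how far the policy may drift from the reference.
    """
    # Implicit rewards: log-ratio of policy to reference, scaled by beta
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Logistic loss that pushes the chosen response above the rejected one
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```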
Job Responsibilities:
Research and implement novel techniques in one or more of our focus areas
Design and conduct rigorous experiments to validate hypotheses
Document findings in scientific publications and blog posts
Integrate research results into Together products
Communicate project plans, progress, and results to the broader team
Requirements:
Currently pursuing a Bachelor's, Master's, or Ph.D. degree in Computer Science, Electrical Engineering, or a related field
Strong knowledge of Machine Learning and Deep Learning fundamentals
Experience with deep learning frameworks (PyTorch, JAX, etc.)
Strong programming skills in Python
Familiarity with Transformer architectures and recent developments in foundation models
Nice to have:
Prior research experience with foundation models or efficient machine learning
Publications at leading ML and NLP conferences (such as NeurIPS, ICML, ICLR, ACL, or EMNLP)
Understanding of model optimization techniques and hardware acceleration approaches
Contributions to open-source machine learning projects