Meta is seeking a creative, skilled, and motivated Research Scientist to advance the state of the art in multi-modal understanding. You will work on developing models that reason across vision, language, and other modalities to enable richer AI experiences across Meta's family of apps and products. You will collaborate with research scientists, software engineers, and data scientists to design technical solutions in a fast-paced multidisciplinary environment.
Job Responsibilities
Develop and advance multi-modal models that integrate vision, language, audio, and other modalities
Research novel architectures and training methods for cross-modal reasoning and understanding
Design and prototype interactive experiences that leverage multi-modal AI capabilities
Collaborate across teams to develop concepts that advance the entire research pipeline (hardware, software, data collection, machine learning, etc.)
Publish research findings at top-tier conferences and contribute to the broader research community
Requirements
Currently has, or is in the process of obtaining, a Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience. Degree must be completed prior to joining Meta
Currently has, or is in the process of obtaining, a PhD degree in Computer Science, Machine Learning, or relevant technical field. Degree must be completed prior to joining Meta
Experience in multi-modal learning, combining vision, audio, language, or related areas
Experience working with PyTorch or TensorFlow
Experience with transformer architectures and large-scale model training
Technical knowledge across machine learning, deep learning, and statistical modeling
Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
Nice to have
First-authored publications at leading conferences such as NeurIPS, ICML, or CVPR
Experience with large language models (LLMs) and their integration with other modalities
Experience transferring multi-modal research into shipping products
Experience working and communicating cross-functionally in a team environment
Research experience in vision-language models, multi-modal transformers, or cross-modal representation learning