Research Scientist Intern, Multimodal AI

Meta

Location:
United States, Redmond

Contract Type:
Not provided

Salary:

7650.00 - 12134.00 USD / Month

Job Description:

The Meta Reality Labs Research Team brings together a world-class team of researchers, developers, and engineers to create the future of virtual and augmented reality, which together will become as universal and essential as smartphones and personal computers are today. And just as personal computers have done over the past 45 years, AR, VR and MR will ultimately change everything about how we work, play, and connect. We are developing all the technologies needed to enable breakthrough AR glasses and VR headsets, including optics and displays, computer vision, audio, graphics, brain-computer interfaces, haptic interaction, eye/hand/face/body tracking, perception science, and true telepresence. Some of those will advance much faster than others, but they all need to happen to enable AR, VR and MR that are so compelling that they become an integral part of our lives.

In particular, the Meta Reality Labs Research audio team is focused on two goals: creating virtual sounds that are perceptually indistinguishable from reality, and redefining human hearing. See more about our work here: "Inside Facebook Reality Labs Research: The future of audio" and "Filter Out the Noise With Conversation Focus". These two initiatives will connect people by letting them feel together despite being physically apart and converse in even the most difficult listening environments.

Meta Reality Labs Research is looking for experienced interns who are passionate about groundbreaking research in audio signal processing, machine learning, and audio-visual learning to solve important audio-driven problems for AR/VR applications. We currently have open positions for a range of projects in multimodal representation learning, audio-visual scene analysis, egocentric audio-visual learning, multi-sensory speech enhancement, and acoustic activity localization. Our internships are twelve (12) to twenty-four (24) weeks long, and we have various start dates throughout the year.

Job Responsibility:

  • Design, implement, and maintain comprehensive evaluation protocols for large language models, including both automated and human-in-the-loop assessments
  • Develop and curate high-quality datasets and benchmarks to measure model performance, safety, fairness, and robustness across a variety of tasks and modalities
  • Analyze model outputs to identify strengths, weaknesses, and failure modes, and provide actionable insights to research and engineering teams
  • Design and implement novel algorithms to solve audio research problems
  • Collaborate with teams building Meta’s language AI products
  • Collaborate with researchers, engineers, and cross-functional partners to define evaluation goals, communicate findings, and drive improvements in model quality
  • Develop tools and infrastructure to streamline and scale evaluation processes, including dashboards, annotation platforms, and reporting systems
  • Stay up-to-date with the latest research in audio LLM evaluation, benchmarking, and responsible AI, and incorporate best practices into Meta’s workflows
  • Disseminate evaluation results through internal reports, presentations, and, when appropriate, external publications

Requirements:

  • Currently has, or is in the process of obtaining, a PhD degree in the field of Computer Science, Artificial Intelligence, Generative AI, Transformer Models, Machine Learning, Signal Processing, or Computer Vision
  • 3+ years of experience with Python, Matlab, or similar
  • 3+ years of experience with machine learning software platforms such as PyTorch, TensorFlow, etc.
  • Experience building novel audio computational models and LLMs
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment

Nice to have:

  • Demonstrated software engineering experience via an internship, work experience, coding competitions, or widely used contributions in open source repositories (e.g. GitHub)
  • Experience in advancing AI techniques, including core contributions to open source libraries and frameworks in computer vision or audio processing
  • Experience with audio and speech quality assessment
  • Experience with multichannel audio processing
  • Experience in visual and acoustic scene analysis
  • Experience manipulating and analyzing complex, large scale, high-dimensionality data from varying sources
  • Proven track record of achieving significant results as demonstrated by grants, fellowships, patents, as well as first-authored publications at leading workshops or top computer vision and machine learning conferences such as NeurIPS, ICML, ICLR, ACL, EMNLP, CVPR, ICCV, ECCV, ICASSP, InterSpeech or similar
  • Experience in utilizing theoretical and empirical research to solve problems
  • Experience working and communicating cross-functionally in a team environment
  • Intent to return to a degree program after the completion of the internship/co-op

Additional Information:

Job Posted:
February 19, 2026

Similar Jobs for Research Scientist Intern, Multimodal AI

PhD AI Research Intern

Join our cutting-edge Machine Learning Research team at Atlassian as a PhD Resea...
Location:
Canada
Salary:
55.00 USD / Hour
Atlassian
Expiration Date:
Until further notice
Requirements:
  • Completed Bachelor's degree in Computer Science or a related field
  • Currently pursuing a PhD in Computer Science or a related field at any stage of your doctoral studies
  • Strong foundation in AI/ML, LLMs, modeling and/or optimization techniques
Job Responsibility:
  • Collaborate cross-functionally with Research Scientists and Machine Learning Engineers to design, implement, and evaluate experiments that advance the performance, efficiency, and scalability of modern ML and LLM systems for our AI products
  • Curate, preprocess, and manage large-scale datasets for training and evaluation, ensuring data quality, diversity, and reproducibility across experiments
  • Conduct continued training, fine-tuning, and alignment of large language models for specialized applications such as conversational AI, summarization, generative search, and multimodal agents
  • Evaluate cutting-edge ML algorithms through rigorous experimentation and provide detailed analyses highlighting performance insights, failure modes, and opportunities for improvement
  • Contribute to publications and presentations at internal workshops or top-tier academic venues, helping to drive innovation in Enterprise AI and large-scale ML systems
What we offer:
  • health and wellbeing resources
  • paid volunteer days

PhD AI Research Intern

Join our cutting-edge Machine Learning Research team at Atlassian as a PhD Resea...
Location:
United States, Seattle
Salary:
49.00 - 75.00 USD / Hour
Atlassian
Expiration Date:
Until further notice
Requirements:
  • Completed Bachelor's degree in Computer Science or a related field
  • Currently pursuing a PhD in Computer Science or a related field at any stage of your doctoral studies
  • Degree completion date cannot be earlier than September 2026 - June 2027
  • Strong foundation in AI/ML, LLMs, modeling and/or optimization techniques
  • Exhibit a solid grasp of algorithms and data structures
  • Demonstrate proficiency in Python programming and ability to write clean, efficient, and well-documented code
  • Experience working with large-scale datasets, including data preprocessing, augmentation, and scaling techniques
  • Has expertise in managing data using Python libraries such as NumPy, Pandas, Matplotlib, in addition to leveraging models from Hugging Face and has practical knowledge of applied machine learning and deep learning frameworks, like PyTorch
  • Demonstrated exposure to natural language processing (NLP) and Computer Vision (CV)
  • Familiarity with state-of-the-art research in machine learning and AI, as evidenced by relevant coursework, publications, or projects
Job Responsibility:
  • Collaborate cross-functionally with Research Scientists and Machine Learning Engineers to design, implement, and evaluate experiments that advance the performance, efficiency, and scalability of modern ML and LLM systems for our AI products
  • Curate, preprocess, and manage large-scale datasets for training and evaluation, ensuring data quality, diversity, and reproducibility across experiments
  • Conduct continued training, fine-tuning, and alignment of large language models for specialized applications such as conversational AI, summarization, generative search, and multimodal agents
  • Evaluate cutting-edge ML algorithms through rigorous experimentation and provide detailed analyses highlighting performance insights, failure modes, and opportunities for improvement
  • Contribute to publications and presentations at internal workshops or top-tier academic venues, helping to drive innovation in Enterprise AI and large-scale ML systems
What we offer:
  • health and wellbeing resources
  • paid volunteer days

Research Scientist Intern, AI Research - Multimodal Pretraining

Meta is seeking Research Scientist Interns in the multimodal pretraining team in...
Location:
United States, Menlo Park
Salary:
7650.00 - 12134.00 USD / Month
Meta
Expiration Date:
Until further notice
Requirements:
  • Currently has or is in the process of obtaining a Ph.D. degree in Computer Science, Machine Learning, Computer Vision, Artificial Intelligence, or relevant technical field
  • Past projects/publications in the general domain of neural scaling laws, model architectures, image/text modeling, vision-language modeling
  • Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment
  • Experience in PyTorch, Triton, or other related programming languages
  • Experience building systems based on machine learning and/or deep learning methods
Job Responsibility:
  • Perform research to advance the frontiers of multimodal (images, video, text, audio, and other modalities) pretraining, to develop the next generation of multimodal architectures
  • Collaborate with researchers and cross-functional partners including communicating research plans, progress, and results
  • Publish research results and contribute to research that can be applied to Meta product development

Research Scientist Intern, AI Research - CoreML - World Models

Meta is seeking Research Interns to join the SAM team in the Multimedia Percepti...
Location:
United States, Menlo Park
Salary:
7650.00 - 12134.00 USD / Month
Meta
Expiration Date:
Until further notice
Requirements:
  • Currently has, or is in the process of obtaining, a PhD degree in Computer Vision, Machine Learning, Artificial Intelligence, or relevant technical field
  • Research and/or work experience in Generative Modeling and Computer Vision. In particular: video generation, 3D/4D reconstruction, video and image understanding, vision-language foundation models, representation learning, and related areas
  • Research and/or work experience in Machine Learning or Deep Learning with applications to perception
  • Experience in Python, C++, or other related languages
  • Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment
Job Responsibility:
  • Perform research to advance the science and technology of generative AI
  • Perform research that enables learning to predict and condition on multimodal data (video, 3D structures, primarily images, text, and other modalities like audio)
  • Brainstorm with research mentors, review literature and existing solutions of a challenging real-world research problem
  • Develop novel solutions, implement prototypes, and perform extensive experiments to test the proposed solutions in meaningful benchmarks and metrics, analyze the results and verify the conclusions
  • Contribute to ongoing research projects and impactful technology releases
  • Draft and polish research publications
  • Present research outcomes to internal and/or external audiences

Research Scientist Intern, AI Research - World Models

Meta is seeking Research Interns to join the SAM team in the Multimedia Percepti...
Location:
United States, Menlo Park
Salary:
7650.00 - 12134.00 USD / Month
Meta
Expiration Date:
Until further notice
Requirements:
  • Currently has, or is in the process of obtaining, a PhD degree in Computer Vision, Machine Learning, Artificial Intelligence, or relevant technical field
  • Research and/or work experience in Generative Modeling and Computer Vision. In particular: video generation, 3D/4D reconstruction, video and image understanding, vision-language foundation models, representation learning, and related areas
  • Research and/or work experience in Machine Learning or Deep Learning with applications to perception
  • Experience in Python, C++, or other related languages
  • Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment
Job Responsibility:
  • Perform research to advance the science and technology of generative AI
  • Perform research that enables learning to predict and condition on multimodal data (video, 3D structures, primarily images, text, and other modalities like audio)
  • Brainstorm with research mentors, review literature and existing solutions of a challenging real-world research problem
  • Develop novel solutions, implement prototypes, and perform extensive experiments to test the proposed solutions in meaningful benchmarks and metrics, analyze the results and verify the conclusions
  • Contribute to ongoing research projects and impactful technology releases
  • Draft and polish research publications
  • Present research outcomes to internal and/or external audiences

Research Scientist Intern, Real-Time Multimodal AI

Reality Labs is building the future of connection through world-class AR/VR hard...
Location:
United States, Burlingame
Salary:
7650.00 - 12134.00 USD / Month
Meta
Expiration Date:
Until further notice
Requirements:
  • Currently has, or is in the process of obtaining, a PhD degree in Computer Science, Machine Learning, Electrical Engineering, or a related field
  • 2+ years of research experience in one or more of the following areas: multimodal learning, vision-language models, large language models, or foundation model fine-tuning
  • Hands-on experience fine-tuning large foundation models (e.g., LLaVA, InternVL, Qwen-VL, LLaMA, or similar)
  • Strong programming skills in Python
  • Experience with deep learning frameworks such as PyTorch
  • Excellent communication skills and ability to work independently
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
Job Responsibility:
  • Research and develop novel approaches for fine-tuning large multimodal foundation models (vision-language, audio-visual) for real-time applications
  • Design and implement efficient inference pipelines for deploying fine-tuned models in real-time communication scenarios
  • Explore agentic architectures that leverage fine-tuned models as tools within larger AI systems
  • Collaborate with cross-functional teams to integrate models into prototype experiences
  • Document and present research progress with the goal of publishing findings at top-tier ML/CV conferences
  • Contribute to building working prototypes that demonstrate the capabilities of fine-tuned multimodal models

Applied Scientist Intern: Multimodal Conversational AI

Microsoft Teams is the hub for teamwork that integrates all the people, content,...
Location:
United Kingdom, Cambridge
Salary:
Not provided
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Currently enrolled in a PhD program (or published candidate in MSc program) in Computer Science, Electrical or Computer Engineering, Statistics, or a related field
  • Practical experience in training and fine-tuning transformer models or LLMs, e.g., using text, audio, and/or images
  • Practical Python coding experience leveraging PyTorch or TensorFlow or similar framework
  • Excellent analytical, coding, communication, and collaborative skills
Job Responsibility:
  • Conduct experiments, create and validate metrics, and develop candidate algorithms to improve live voice conversation experiences with AI agents by leveraging real-time multimodal data
  • Collaborate closely with CMD Labs researchers and engineers to leverage existing assets, datasets, and ensure results can be leveraged back into the product
  • Embody Microsoft culture and values
Job Type: Full-time

Research Scientist Intern, Multimodal Generative AI and Robotics

The research intern will work on cutting edge research problems to innovate nove...
Location:
United States, Redmond
Salary:
7650.00 - 12134.00 USD / Month
Meta
Expiration Date:
Until further notice
Requirements:
  • Currently has, or is in the process of obtaining, a PhD degree in the domain of computer vision, computer graphics, 3D machine perception, or deep learning
  • Knowledge in deep learning, computer vision, graphics, generative modeling, LLMs and VLMs
  • Hands-on experience with implementing deep learning algorithms, large-scale training, benchmark and evaluation
  • Experience working within Python environments such as PyTorch
  • Experience working in a Unix environment
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
Job Responsibility:
  • Plan and execute cutting-edge research and development to advance the state-of-the-art in machine learning and large-scale training
  • Collaborate with other researchers and engineers across machine perception teams at Meta to develop experiments, prototypes, and concepts that advance the state-of-the-art contextual AI and robotic systems
  • Work with the team to help design, set up, and run practical experiments and prototype systems related to large-scale high-quality sensing and machine reasoning
Job Type: Full-time