
Research Scientist / Engineer — Video / Audio Generation

Location: United States, Palo Alto
Salary: 250000.00 - 450000.00 USD / Year
Job Posted: February 10, 2026

Job Description

This is a rare and foundational opportunity to define the future of creative AI. You will be at the forefront of building and training large-scale multimodal generative models, directly impacting how users create and interact with video and audio. This role offers the chance to bridge cutting-edge research with magical, shipped products, working end-to-end on novel problems with no existing playbook.

Job Responsibility

  • Architect large-scale video and audio generative models, focusing on strong temporal coherence and high perceptual quality
  • Design, implement, and run robust data pipelines for curating, filtering, and captioning massive video and audio datasets
  • Train large-scale video and audio generative models on massive datasets and GPU clusters
  • Define and build novel evaluation frameworks to measure realism, temporal consistency, controllability, and human-aligned creative quality

Requirements

  • Strong foundation in machine learning and generative modeling, with experience in video, audio, or multimodal domains
  • Deep understanding of autoregressive, diffusion/flow-based, or hybrid generative models, and their tradeoffs for long-horizon generation
  • Hands-on experience with PyTorch and large-scale training (distributed, mixed precision, large datasets)

Nice to have

  • Experience with data, modeling, or evaluation for any of the following:
  • Text-to-video/audio models
  • Vision language models
  • Audio language models

Similar Jobs for

Research Scientist / Engineer — Video / Audio Generation

8 matching positions

Research Scientist / Engineer – Pre-training / Scaling

At Luma, the Pre-Training / Scaling team is responsible for building the core mu...
Location: United States, Palo Alto
Salary: 187500.00 - 395000.00 USD / Year
Company: Luma AI
Expiration Date: Until further notice
Requirements
  • Expertise in Python and PyTorch with experience building ML models from scratch
  • Deep understanding of multimodal generative models and deep learning architectures
  • (Preferred) Strong research track record in generative AI with published work in top-tier venues
  • (Preferred) Experience with large-scale distributed training systems
Job Responsibility
  • Lead cutting-edge research in multimodal foundation models spanning video, image, text, and audio
  • Design and implement novel algorithms, architectures, and techniques for large-scale generative AI models
  • Develop training methodologies for foundation models across thousands of GPUs
  • Research and implement state-of-the-art techniques in Autoregressive LLMs, Vision Language Models, and / or Diffusion Models
  • Collaborate with cross-functional teams to transition research into production systems
Employment type: Full-time

Research Scientist Intern, Audio Quality with AI (PhD)

The Meta Reality Labs Research Team brings together a world-class team of resear...
Location: United States, Redmond
Salary: 7650.00 - 12134.00 USD / Month
Company: Meta
Expiration Date: Until further notice
Requirements
  • Currently has, or is in the process of obtaining, a PhD degree in the field of Speech and Hearing Science, Auditory Neuroscience, Computational Neuroscience, Computer Science, Artificial Intelligence, Generative AI, Transformer Models, Machine Learning, Signal Processing or Computer vision
  • 3+ years experience with Python, Matlab, or similar
  • 3+ years of experience with machine learning software platforms such as PyTorch, TensorFlow, etc.
  • Background in speech perception, psychoacoustics, or acoustic phonetics
  • Experience deploying novel audio computational models and LLMs
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment
Job Responsibility
  • Investigate systematic phonemic errors as causal factors in perceived speech quality degradation, and link them to human perceptual judgments
  • Build and curate datasets and benchmarks of speech for phoneme-level analysis
  • Explore and compare the capabilities of audio and video (multimodal) LLMs as tools to support this analysis
  • Relate findings to human perceptual data (quality preference and intelligibility) and translate them into actionable insights for research and engineering teams
  • Where appropriate, adapt multimodal models to the task in a supporting capacity
  • Collaborate with researchers, engineers, and cross-functional partners to define goals, communicate findings, and drive improvements in speech quality
  • Develop tools and infrastructure to streamline and scale the analysis
  • Stay current with research in speech perception and audio quality and intelligibility assessment, and incorporate best practices into Meta's workflows
  • Disseminate results through internal reports and presentations, and, when appropriate, external publications
What we offer
  • benefits
Employment type: Full-time

AI Research Scientist, Multimodal Generation

Meta is seeking an AI Research Scientist to join our Multimodal Generation Resea...
Salary: 154000.00 - 217000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • PhD in Computer Science, Machine Learning, or a relevant technical field
  • Practical experience with pre-training, mid-training or SFT data curation for large foundational models and experience working with organic, synthetic, agentic, or reasoning data for Multimodal LLMs
  • Direct experience in Generative AI and LLM research
  • Programming experience in Python and hands-on experience with frameworks such as PyTorch
Job Responsibility
  • Develop algorithms based on state-of-the-art machine learning and neural network methodologies
  • Advance our understanding of data research, such as how to overcome data walls and how best to create synthetic data
  • Post-train foundation models using techniques such as Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), Direct Preference Optimization (DPO), and Low-Rank Adaptation (LoRA)
  • Work towards long-term research/development goals, while identifying intermediate milestones
  • Conduct research that enables learning the semantics of data across multiple modalities (audio, images, video, text, and other modalities)
  • Prioritize research that can be applied to Meta's product development
What we offer
  • bonus
  • equity
  • benefits

Tech Lead, AI Research Scientist (Robotics)

Meta is seeking an AI Research Scientist to join Meta Superintelligence Labs. In...
Location: United States, Menlo Park
Salary: 219000.00 - 301000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • PhD degree in the field of Artificial Intelligence, Robotics, Computer Vision, Machine Learning, Language, a related field, or equivalent practical experience
  • Experience with any of the following research areas: robotics, motion planning, embodied AI, human-robot interaction, sim-to-real transfer, learning from demonstration, reinforcement learning, dexterous manipulation, digital agents, vision language models, computer vision, egocentric perception, and/or LLMs
  • 5+ years of industry experience in relevant robotics-related research areas, such as: robot learning, reinforcement learning, imitation learning, action-conditioned world models, task and motion planning, sim-to-real transfer, robotic control, manipulation, navigation, or generally embodied AI
  • Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment
Job Responsibility
  • Perform fundamental and applied research to push the scientific and technological frontiers of embodied artificial intelligence
  • Invent/improve novel data-driven paradigms for robotics, leveraging a variety of modalities (images, video, text, audio, tactile, etc.)
  • Investigate paradigms that can deliver a spectrum of embodied behaviors - from simulated characters to real robots, and from short-horizon, low-level to long-horizon, high-level intelligence
  • Develop algorithms based on state-of-the-art machine learning and neural network methodologies
  • Define, build and benchmark new capabilities needed for the next generation of AI
  • Conduct research towards long-term research goals while identifying intermediate milestones
  • Lead, plan, and execute novel research based on long-term objectives of the organization
  • Set research strategy and direction, and provide mentorship for a team of researchers
What we offer
  • bonus
  • equity
  • benefits
Employment type: Full-time

AI Research Scientist, Robotics

Meta is seeking a Research Scientist to join Meta Superintelligence Labs. Indivi...
Location: United States, Menlo Park
Salary: 184000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • PhD degree in the field of Artificial Intelligence, Robotics, Computer Vision, Machine Learning, Language, a related field, or equivalent practical experience
  • Experience with any of the following research areas: robotics, motion planning, embodied AI, human-robot interaction, sim-to-real transfer, learning from demonstration, reinforcement learning, dexterous manipulation, digital agents, vision language models, computer vision, egocentric perception, and/or LLMs
  • 2+ years of industry experience in relevant robotics-related research areas, such as: robot learning, reinforcement learning, imitation learning, action-conditioned world models, task and motion planning, sim-to-real transfer, robotic control, manipulation, navigation, or generally embodied AI
  • Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment
Job Responsibility
  • Perform fundamental and applied research to push the scientific and technological frontiers of embodied artificial intelligence
  • Invent/improve novel data-driven paradigms for robotics, leveraging a variety of modalities (images, video, text, audio, tactile, etc.)
  • Investigate paradigms that can deliver a spectrum of embodied behaviors - from simulated characters to real robots, and from short-horizon, low-level to long-horizon, high-level intelligence
  • Develop algorithms based on state-of-the-art machine learning and neural network methodologies
  • Define, build and benchmark new capabilities needed for the next generation of AI
  • Conduct research towards long-term research goals while identifying intermediate milestones
  • Lead, plan, and execute novel research based on long-term objectives of the organization
What we offer
  • bonus
  • equity
  • benefits
Employment type: Full-time

AI Research Scientist (Technical Leadership), Multimodal - Monetization GenAI

The Monetization GenAI Video Gen & Visual Search group, part of the Ads pillar, ...
Location: United States, Menlo Park, CA
Salary: 219000.00 - 301000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • Has obtained a PhD in Computer Science, AI/ML, or a relevant technical field
  • Experience as a technical lead, driving major technical initiatives with cross-functional impact and influencing strategy across multiple teams
  • 4+ years of experience training large language and/or vision models, with extensive and recent experience training multimodal LLMs
  • Research expertise in video generation/understanding, multimodal learning, or diffusion models
  • Demonstrated significant industry influence in the field of AI and/or recently published research in leading peer-reviewed conferences (e.g., ACL, NeurIPS, ICML, ICLR, AAAI, KDD, CVPR, ICCV)
Job Responsibility
  • Lead end-to-end AI research and model development for video-centric generative AI across Meta's advertising surfaces
  • Drive advancements in video generation & enhancement
  • Develop video-to-video & audio generation capabilities
  • Advance video & visual understanding through novel research
  • Conduct foundation model research to support generative AI innovation
  • Define research agendas and pioneer new directions in video/audio generation and multimodal understanding
What we offer
  • bonus
  • equity
  • benefits
Employment type: Full-time

AI Research Scientist, Robotics

The ideal Research Scientist candidate will use their skills in system design an...
Location: United States, Redmond
Salary: 154000.00 - 217000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • Currently has or is in the process of obtaining a PhD degree in the field of Artificial Intelligence, Robotics, Computer Vision, Machine Learning, Language, a related field, or equivalent practical experience
  • Experience with any of the following research areas: robotics, motion planning, embodied AI, human-robot interaction, sim-to-real transfer, learning from demonstration, reinforcement learning, dexterous manipulation, digital agents, vision language models, computer vision, egocentric perception, and/or LLMs
  • Experience in relevant robotics-related research areas, such as: VLM, robot learning, reinforcement learning, imitation learning, action-conditioned world models, task and motion planning, sim-to-real transfer, robotic control, manipulation, navigation, or generally embodied AI
Job Responsibility
  • Perform fundamental and applied research to push the scientific and technological frontiers of embodied artificial intelligence
  • Invent/improve novel data-driven paradigms for robotics, leveraging a variety of modalities (images, video, text, audio, tactile, etc.)
  • Investigate paradigms that can deliver a spectrum of embodied behaviors - from simulated characters to real robots, and from short-horizon, low-level to long-horizon, high-level intelligence
  • Develop algorithms based on state-of-the-art machine learning and neural network methodologies
  • Define, build and benchmark new functionalities needed for the next generation of AI
  • Conduct research towards long-term product goals while identifying intermediate milestones
  • Plan and execute novel research based on long-term objectives of the organization
What we offer
  • bonus
  • equity
  • benefits

AI Research Scientist, Robotics

At Meta, we’re building the future of human connection and the technology that e...
Location: United States, Redmond
Salary: 219000.00 - 301000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • PhD degree in the field of Artificial Intelligence, Robotics, Computer Vision, Machine Learning, Language, a related field, or equivalent practical experience
  • Experience with any of the following research areas: robotics, motion planning, embodied AI, human-robot interaction, sim-to-real transfer, learning from demonstration, reinforcement learning, dexterous manipulation, digital agents, vision language models, computer vision, egocentric perception, and/or Large Language Models
  • 5+ years of industry experience in relevant robotics-related research areas, such as: Vision Language Models, robot learning, reinforcement learning, imitation learning, action-conditioned world models, task and motion planning, sim-to-real transfer, robotic control, manipulation, navigation, or generally embodied AI
Job Responsibility
  • Perform fundamental and applied research to push the scientific and technological frontiers of embodied artificial intelligence
  • Invent/improve novel data-driven paradigms for robotics, leveraging a variety of modalities (images, video, text, audio, tactile, etc.)
  • Investigate paradigms that can deliver a spectrum of embodied behaviors - from simulated characters to real robots, and from short-horizon, low-level to long-horizon, high-level intelligence
  • Develop algorithms based on state-of-the-art machine learning and neural network methodologies
  • Define, build and benchmark new functionality needed for the next generation of AI
  • Conduct research towards long-term product goals while identifying intermediate milestones
  • Lead, plan, and execute novel research based on long-term objectives of the organization
What we offer
  • bonus
  • equity
  • benefits