AI Researcher (Multimodal Perception Models)

Tavus

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

Not provided

Job Description:

We’re looking for an AI Researcher to join our core AI team and help push the frontier of multimodal conversational intelligence. If you thrive in fast-paced environments, love turning abstract ideas into running code, and get energy from exploring the edge of what’s possible, then this is the perfect role for you.

Job Responsibility:

  • Conduct research on Foundational Multimodal Models in the context of Conversational Avatars (e.g., Neural Avatars, Talking-Heads)
  • Model video, audio, and language sequences using autoregressive architectures, predictive architectures (e.g., V-JEPA), and/or diffusion paradigms, with an emphasis on temporal and sequential data rather than static images
  • Collaborate with the Applied ML team to bring your work to life in production systems
  • Stay at the cutting edge of multimodal learning and help us define what “cutting edge” means next

Requirements:

  • A PhD (or near completion) in a relevant field, or equivalent hands-on research experience
  • Experience modeling human behavior and generation (facial expressions, affect, or speech), ideally in conversational or interactive settings
  • Deep understanding of sequence modeling in video/audio/language domains
  • Familiarity with large model training, especially LLMs or VLMs
  • Strong background in Deep Learning (from Transformers to Diffusion Models) and how to make them work in practice
  • Excellent programming skills, especially in PyTorch

Nice to have:

  • Publications in top-tier conferences like CVPR, ICCV, NeurIPS, ECCV, or ACM MM
  • Broader understanding of generative AI and multimodal architectures
  • Familiarity with software engineering best practices
  • Curiosity and a flexible mindset: you like building and experimenting

What we offer:

  • flexible work schedule
  • unlimited PTO
  • competitive healthcare
  • gear stipends

Additional Information:

Job Posted:
February 18, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for AI Researcher (Multimodal Perception Models)

Senior Machine Learning Engineer, Perception

We are seeking a highly skilled Machine Learning Engineer with deep expertise in...
Location:
United States, Santa Clara
Salary:
145000.00 - 200000.00 USD / Year
PlusAI
Expiration Date:
Until further notice
Requirements:
  • Ph.D. or Masters in AI, Computer Science, Electrical Engineering, Robotics, or a related field
  • Ph.D. new grad or Masters + 3 years industry experience
  • Proficiency in Python and experience building deep learning pipelines
  • Strong expertise in PyTorch, TensorFlow, or JAX
  • Proven experience with LiDAR-based 3D perception and BEV representation models
  • Deep understanding of multimodal sensor fusion architectures and techniques
  • Familiarity with camera, LiDAR, and radar modalities and their synchronization, calibration, and integration in perception pipelines
  • Solid foundation in computer vision, deep learning, and 3D geometry
Job Responsibility:
  • Design, implement, and optimize BEV-based perception models that fuse camera, LiDAR, and radar inputs
  • Benchmark perception models using large-scale datasets and well-defined quantitative metrics
  • Collaborate cross-functionally with research, data, and deployment engineers to refine models and support real-world applications
  • Maintain a strong focus on performance, robustness, and scalability for deployment in production systems
  • Ensure that your work is performed in accordance with the company’s Quality Management System (QMS) requirements and contribute to continuous improvement efforts
  • Ensure team compliance with QMS, monitor quality, and drive process improvements
Employment Type:
Full-time

Research Scientist Intern, AI Research - CoreML - World Models

Meta is seeking Research Interns to join the SAM team in the Multimedia Percepti...
Location:
United States, Menlo Park
Salary:
7650.00 - 12134.00 USD / Month
Meta
Expiration Date:
Until further notice
Requirements:
  • Currently has, or is in the process of obtaining, a PhD degree in Computer Vision, Machine Learning, Artificial Intelligence, or relevant technical field
  • Research and/or work experience in Generative Modeling and Computer Vision. In particular: video generation, 3D/4D reconstruction, video and image understanding, vision-language foundation models, representation learning, and related areas
  • Research and/or work experience in Machine Learning or Deep Learning with applications to perception
  • Experience in Python, C++, or other related languages
  • Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment
Job Responsibility:
  • Perform research to advance the science and technology of generative AI
  • Perform research that enables learning to predict and condition on multimodal data (video, 3D structures, primarily images, text, and other modalities like audio)
  • Brainstorm with research mentors, review literature and existing solutions of a challenging real-world research problem
  • Develop novel solutions, implement prototypes, and perform extensive experiments to test the proposed solutions in meaningful benchmarks and metrics, analyze the results and verify the conclusions
  • Contribute to ongoing research projects and impactful technology releases
  • Draft and polish research publications
  • Present research outcomes to internal and/or external audiences

AI Research Scientist (Technical Leadership), Data Research - MSL FAIR

Meta is seeking research scientists to help us build the data foundation for Met...
Location:
United States, Menlo Park
Salary:
219000.00 - 301000.00 USD / Year
Meta
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • PhD in Computer Science or a related technical field
  • 4+ years of industry research experience in NLP or CV
  • 4+ years of experience as a formal technical lead
  • Experience leading major technical initiatives with cross-functional impact and influencing strategy across multiple teams
  • Practical experience with multimodal pre-training or mid-training data curation for large language models, media perception, or media generation models
  • Published research in leading peer-reviewed conferences (e.g., ACL, NeurIPS, ICML, ICLR, AAAI, KDD, CVPR, ICCV) and/or demonstrated significant industry influence in the field of AI
Job Responsibility:
  • Collaborate with cross-functional teams to develop Meta’s next foundational models
  • Advance our understanding of data research, such as how to overcome data walls and how best to create synthetic data
  • Architect efficient and scalable data curation systems and pipelines
  • Fundamentally improve our data velocity across workflows and projects by contributing to the advancement of data tooling
  • Execute on high priority projects in pre-training, mid-training, or post-training data curation
  • Apply specialized expertise in video/image generation, video/image perception, OCR, agentic data, synthetic data, reasoning data, web parser, coding data, data scaling laws, or datamix optimization
  • Lead complex technical projects end-to-end
What we offer:
  • bonus
  • equity
  • benefits

Research Scientist Intern, AI Research - World Models

Meta is seeking Research Interns to join the SAM team in the Multimedia Percepti...
Location:
United States, Menlo Park
Salary:
7650.00 - 12134.00 USD / Month
Meta
Expiration Date:
Until further notice
Requirements:
  • Currently has, or is in the process of obtaining, a PhD degree in Computer Vision, Machine Learning, Artificial Intelligence, or relevant technical field
  • Research and/or work experience in Generative Modeling and Computer Vision. In particular: video generation, 3D/4D reconstruction, video and image understanding, vision-language foundation models, representation learning, and related areas
  • Research and/or work experience in Machine Learning or Deep Learning with applications to perception
  • Experience in Python, C++, or other related languages
  • Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment
Job Responsibility:
  • Perform research to advance the science and technology of generative AI
  • Perform research that enables learning to predict and condition on multimodal data (video, 3D structures, primarily images, text, and other modalities like audio)
  • Brainstorm with research mentors, review literature and existing solutions of a challenging real-world research problem
  • Develop novel solutions, implement prototypes, and perform extensive experiments to test the proposed solutions in meaningful benchmarks and metrics, analyze the results and verify the conclusions
  • Contribute to ongoing research projects and impactful technology releases
  • Draft and polish research publications
  • Present research outcomes to internal and/or external audiences

Senior Research Perception Engineer

As a Senior Perception Engineer - Spatial Understanding & Navigation, you will l...
Location:
United Kingdom, London
Salary:
80000.00 - 150000.00 GBP / Year
Randstad
Expiration Date:
February 19, 2026
Requirements:
  • Strong experience in machine learning for vision, robotics, or embodied AI
  • Deep expertise in scene understanding, spatial reasoning, or 3D perception
  • Hands-on experience with large models (VLMs, VLAs, transformers, diffusion, multimodal models)
  • Advanced PyTorch skills and experience deploying large-scale ML systems
  • Strong research and experimentation mindset - from concept to production
  • Comfortable working in a fast-paced, research-driven startup environment
Job Responsibility:
  • Develop next-generation spatial understanding systems for robot locomotion and manipulation
  • Build open-ended navigation powered by Vision-Language-Action (VLA) models
  • Design large-scale data pipelines and auto-labelling systems for multimodal training
  • Implement scene understanding and 3D reconstruction for persistent spatial memory
  • Integrate large vision-language models into real-world robotic platforms
  • Evaluate new model architectures, datasets, and benchmarks to guide embodied AI strategy
  • Collaborate with robotics, research, and platform engineering teams
What we offer:
  • Competitive salary plus stock options
  • Relocation assistance
  • Paid annual leave and additional sick leave (aligned with local labour laws)
  • Travel opportunities to Vancouver and Boston offices
  • Free breakfasts, lunches, snacks, and regular team events
  • High ownership and real influence over product direction
  • Collaboration with world-class engineers and AI researchers
  • Startup culture focused on speed, transparency, and minimal bureaucracy

Senior AI Research Scientist

Join Axon and be a Force for Good. At Axon, we’re on a mission to Protect Life. ...
Location:
Finland, Tampere
Salary:
Not provided
Axon
Expiration Date:
Until further notice
Requirements:
  • PhD in Computer Science or a related field with a focus on LLM, MLLMs, Computer Vision, GenAI
  • 5+ years of experience for ML Scientist, 8+ years for Sr. ML Scientist, 10+ years for Principal ML Scientist
  • Proven track record of research excellence in LLM, MLLM, Computer Vision, Robotics Perception, GenAI, demonstrated through publications in top-tier conferences or journals
  • Strong proficiency in programming languages such as Python, C/C++
  • Experience with deep learning frameworks such as TensorFlow, PyTorch, or Keras
  • Experience with ROS (Robot Operating System)
  • Drive one or more phases of the ML development lifecycle: shape datasets, investigate modeling approaches and architectures, train/evaluate/tune models and implement the end-to-end training pipeline
  • Leverage state-of-the-art research to deliver high quality models enabling multiple AI projects at scale
  • Contribute back to the research community via academic publications, tech blogs, open-source code and contributing to internal/external AI challenges
  • Experience in developing computer vision algorithms for resource-constrained devices such as mobile phones, IoT devices, or embedded systems is highly desirable
Job Responsibility:
  • Own one or more key technical areas across the LLM, MLLM, and CV product portfolio
  • Provide technical leadership to junior scientists, guiding the transition of R&D concepts into impactful Axon product features
  • Research and develop cutting-edge techniques in LLMs, MLLMs, GenAI, and Computer Vision across cloud, device, and sensor-based data sources
  • Design and implement efficient and scalable MLLM models for inference and analysis of multimodal data
  • Explore novel approaches to address challenges in NLP, NLU, Object Detection, Object Recognition, Object Tracking, Segmentation, and Scene Understanding
  • Optimize AI models and algorithms for performance, memory footprint, and energy efficiency to meet the requirements of resource-constrained devices
  • Join forces with MLEs and firmware or hardware engineers to leverage hardware accelerators and optimize algorithms for specific hardware architectures
  • Evaluate the performance of LLM, MLLM, CV models using real-world datasets and design experiments to validate their effectiveness
  • Stay up-to-date with the latest research trends and advancements in computer vision, machine learning, deep learning, MLLMs, and GenAI, and integrate relevant findings into our projects
  • Contribute to patent disclosures, academic publications, and technical documentation to share insights and findings with the broader community
What we offer:
  • Competitive salary and 401k with employer match
  • Discretionary paid time off
  • Paid parental leave for all
  • Medical, Dental, Vision plans
  • Fitness Programs
  • Emotional & Mental Wellness support
  • Learning & Development programs
  • Snacks in our offices
Employment Type:
Full-time

AI Research Scientist, Media Data Research - MSL FAIR

Meta is seeking AI research scientists to help us build the data foundation for ...
Location:
United States, Menlo Park
Salary:
154000.00 - 217000.00 USD / Year
Meta
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • PhD in Computer Science or a related technical field
  • 1+ year of industry research experience in LLM/LMM, computer vision, or related AI/ML models
  • Experience owning and/or driving complex technical projects from end-to-end
  • Practical experience with multimodal pre-training or mid-training data curation for large media perception or generation models
  • Published research in leading peer-reviewed conferences (e.g., ACL, NeurIPS, ICML, ICLR, AAAI, KDD, CVPR, ICCV) and/or demonstrated significant industry influence in the field of AI
Job Responsibility:
  • Collaborate with cross-functional teams to develop Meta’s next foundational models
  • Advance our understanding of data research, such as how to overcome data walls and how best to create synthetic data
  • Fundamentally improve our data velocity across workflows and projects by contributing to quality in data tooling
  • Execute on high priority projects in pre-training, mid-training, or post-training data curation
  • Apply specialized expertise in video/image generation, video/image perception, OCR, data scaling laws, or data mixing
  • Lead complex technical projects end-to-end
What we offer:
  • bonus
  • equity
  • benefits

AI Research Scientist, Media Data Research

Meta is seeking AI research scientists to help us build the data foundation for ...
Location:
United States, Menlo Park
Salary:
184000.00 - 257000.00 USD / Year
Meta
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • PhD in Computer Science or a related technical field
  • 2+ years of industry research experience in LLM/NLP, computer vision, or related AI/ML models
  • Experience as a formal technical lead, leading major technical initiatives with cross-functional impact, and/or influencing strategy across multiple teams
  • Practical experience with multimodal pre-training or mid-training data curation for large media perception or generation models
  • Published research in leading peer-reviewed conferences (e.g., ACL, NeurIPS, ICML, ICLR, AAAI, KDD, CVPR, ICCV) and/or demonstrated significant industry influence in the field of AI
Job Responsibility:
  • Collaborate with cross-functional teams to develop Meta’s next foundational models
  • Advance our understanding of data research, such as how to overcome data walls and how best to create synthetic data
  • Fundamentally improve our data velocity across workflows and projects by contributing to the advancement of data tooling
  • Execute on high priority projects in pre-training, mid-training, or post-training data curation
  • Apply specialized expertise in video/image generation, video/image perception, OCR, data scaling laws, or data mixing
  • Lead complex technical projects end-to-end
What we offer:
  • bonus
  • equity
  • benefits
Employment Type:
Full-time