Research Scientist / Engineer – Multimodal Capabilities

Luma AI

Location:
United States, Palo Alto

Contract Type:
Not provided

Salary:

187500.00 - 395000.00 USD / Year

Job Description:

This is a high-impact opportunity to define the future of what our models can do. As a first-principles researcher, you will tackle the most ambitious questions at the heart of our mission: how can the fusion of vision, audio, and language unlock entirely new, magical behaviors in AI? You will not just be improving existing systems; you will be charting the course for the next generation of model capabilities and designing the core experiments that will shape the future of our technology and products.

Job Responsibility:

  • Research and Define the next frontier of multimodal capabilities, identifying key gaps in our current models and designing the experiments to solve them
  • Design and Execute novel experiments, datasets, and methodologies to systematically improve model performance across vision, audio, and language
  • Develop and Pioneer new evaluation frameworks and benchmarking approaches to precisely measure novel multimodal behaviors and capabilities
  • Collaborate Deeply with other research teams to translate your findings into our core training recipes and unlock new product experiences
  • Build and Prototype compelling demonstrations that showcase the groundbreaking multimodal capabilities you have unlocked

Requirements:

  • PhD or equivalent research experience in a field related to AI, Machine Learning, or Computer Science
  • Strong programming skills in Python and deep, hands-on experience with PyTorch
  • Proven track record of working with multimodal data pipelines and curating large-scale datasets for research
  • Deep, fundamental understanding of at least one of the core modalities: computer vision, audio processing, or natural language processing
  • Ability to thrive when tackling the most ambitious, open-ended research challenges in a fast-paced, collaborative environment

Nice to have:

  • Direct expertise working with complex, interleaved multimodal data (video, audio, text)
  • Hands-on experience training or fine-tuning Vision Language Models (VLMs), Audio Language Models, or large-scale generative video models from scratch
  • A strong publication record in top-tier AI conferences (e.g., NeurIPS, ICML, CVPR, ICLR)
  • Experience leading ambitious, open-ended research projects from ideation to tangible results

Additional Information:

Job Posted:
January 13, 2026

Employment Type:
Fulltime
Work Type:
Remote work

Similar Jobs for Research Scientist / Engineer – Multimodal Capabilities

Sr. Applied Research Scientist

We’re looking for a Sr. Applied Research Scientist to lead efforts in building l...
Location:
United States
Salary:
280000.00 - 380000.00 USD / Year
Runway
Expiration Date
Until further notice
Requirements
  • 4+ years of relevant ML engineering or research experience in language models
  • Very strong programming skills and ability to write clean and maintainable research code
  • Deep interest in building human-in-the-loop systems for creativity
  • Passion for seeing research through from initial conception to eventual application
  • Experience mentoring and teaching other researchers
  • Strong communication, collaboration, and documentation skills
Job Responsibility
  • Lead efforts in building large language models and vision language models that power Runway’s research and tools, with a focus on multimodal capabilities and reasoning
Employment Type:
Fulltime

Research Scientist - Generative AI

This role involves conducting high-quality research in generative AI, designing ...
Location:
United States
Salary:
101900.00 - 234500.00 USD / Year
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements
  • PhD in Computer Science, Artificial Intelligence, Machine Learning, Physics, Mathematics, or other related fields
  • 3-5 years working experience with training and fine-tuning generative AI models including LLMs, diffusion models, or Energy-Based Models
  • Proven track record of research in generative models, demonstrated through publications, patents, or publicly available projects
  • Proficiency in programming languages commonly used in AI research, such as Python, and experience with AI/ML frameworks (e.g., TensorFlow, PyTorch)
  • Deep understanding of machine learning algorithms and principles, especially in the context of generative AI
  • Strong mathematical background, with excellent skills in areas such as statistics, probability, linear algebra
  • Creative and analytical thinking abilities, with a passion for solving complex problems
  • Excellent communication skills, capable of conveying complex ideas clearly and engaging with both technical and non-technical audiences
Job Responsibility
  • Conduct high-quality research in generative AI, including but not limited to designing algorithms for pre-training and post-training current autoregressive and diffusion models for multimodal data
  • Design, implement, and validate new algorithms and models for augmented LLMs, pushing the boundaries of AI capabilities
  • Develop and prototype novel algorithms for fine-tuning, retrieval-augmented generation, and in-context learning for various generative models
  • Develop algorithms for training and inference in Energy-Based Models
  • Collaborate with cross-functional teams to apply research findings to develop new products or enhance existing ones
  • Publish research papers in top-tier journals and conferences, sharing findings with the broader scientific community
  • Stay abreast of the latest AI research and trends, identifying opportunities for innovation and improvement
  • Mentor junior researchers and engineers, fostering a culture of knowledge sharing and collaboration
  • Develop prototypes and proof-of-concept implementations to demonstrate the potential of research findings
  • Engage with the academic community by attending conferences, workshops, and seminars
What we offer
  • A competitive salary and extensive social benefits
  • Diverse and dynamic work environment
  • Work-life balance and support for career development
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
Employment Type:
Fulltime

Research Scientist - Generative AI

As a Research Scientist in the Emergent Machine Intelligence Team at Hewlett Pac...
Location:
United States, Santa Barbara
Salary:
101900.00 - 234500.00 USD / Year
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements
  • PhD in Computer Science, Artificial Intelligence, Machine Learning, Physics, Mathematics, or other related fields
  • 3-5 years working experience with training and fine-tuning generative AI models including LLMs, diffusion models, or Energy-Based Models
  • Proven track record of research in generative models, demonstrated through publications, patents, or publicly available projects
  • Proficiency in programming languages commonly used in AI research, such as Python, and experience with AI/ML frameworks (e.g., TensorFlow, PyTorch)
  • Deep understanding of machine learning algorithms and principles, especially in the context of generative AI
  • Strong mathematical background, with excellent skills in areas such as statistics, probability, linear algebra
  • Creative and analytical thinking abilities, with a passion for solving complex problems
  • Excellent communication skills, capable of conveying complex ideas clearly and engaging with both technical and non-technical audiences.
Job Responsibility
  • Conduct high-quality research in generative AI, including but not limited to designing algorithms for pre-training and post-training current autoregressive and diffusion models for multimodal data
  • Design, implement, and validate new algorithms and models for augmented LLMs, pushing the boundaries of AI capabilities
  • Develop and prototype novel algorithms for fine-tuning, retrieval-augmented generation, and in-context learning for various generative models
  • Develop algorithms for training and inference in Energy-Based Models
  • Collaborate with cross-functional teams to apply research findings to develop new products or enhance existing ones
  • Publish research papers in top-tier journals and conferences, sharing findings with the broader scientific community
  • Stay abreast of the latest AI research and trends, identifying opportunities for innovation and improvement
  • Mentor junior researchers and engineers, fostering a culture of knowledge sharing and collaboration
  • Develop prototypes and proof-of-concept implementations to demonstrate the potential of research findings
  • Engage with the academic community by attending conferences, workshops, and seminars.
What we offer
  • A competitive salary and extensive social benefits
  • Diverse and dynamic work environment
  • Work-life balance and support for career development.
Employment Type:
Fulltime

Tech Lead Manager - Behaviour Learning for Embodied AI

The Science organisation at Wayve advances foundational research in embodied AI ...
Location:
United Kingdom, London
Salary:
Not provided
Wayve
Expiration Date
Until further notice
Requirements
  • Years of experience in applied ML/AI roles with strong hands-on contributions
  • Demonstrated track record of impactful technical work in one or more of: multimodal learning, reinforcement learning, generative models, latent action modelling, optimisation, or planning
  • Experience building large-scale ML infrastructure and working with high-dimensional temporal data (e.g., video, multi-sensor inputs)
  • Deep understanding of the end-to-end lifecycle of ML research and deployment
  • Strong Python and PyTorch engineering fundamentals, with experience developing research-grade, production-oriented tools
  • Proven ability to shape technical strategy and lead architectural design for ML systems
  • Publications at top-tier ML conferences such as NeurIPS, ICML, CoRL or ICLR
  • Clear and thoughtful communicator, capable of influencing technical direction and mentoring others without formal reporting lines
Job Responsibility
  • Architect the future – Design and evolve models for efficient, robust, and adaptable autonomy, setting a high technical bar for quality and innovation
  • Accelerate research impact – Partner with team members to test, scale, and productionise research ideas - from architecture design to data strategy. Provide technical guidance and feedback on research design, implementation, and evaluation. Implement scalable, high-throughput training pipelines for models with temporal context and develop and evaluate novel data sampling strategies to accelerate training and generalisation
  • Get hands-on when it matters – Lead from the front by contributing directly to key system components, codebases, and experiments, especially during high-leverage moments. Contribute directly as an IC on core research and development tasks (~60-70% of time)
  • Disrupt thoughtfully – Challenge assumptions, ask sharp questions, and champion bold ideas that push us beyond incremental gains and toward breakthrough advances
  • Make things happen – Lead a high-performing, cross-functional team of applied scientists and ML engineers working across ML, RL, representation learning, planning, among many more. Work closely with the team manager to drive quarterly planning and execution of research-engineering initiatives, enabling rapid iteration and delivery in high-ambiguity environments. Translate ambiguity into action and ensure technical progress tracks with our mission
  • Champion change – Lead through ambiguity. Balance structure and adaptability to help your team navigate evolving priorities, novel research, and complex organisational change

Staff Machine Learning Engineer

We are seeking a Staff Machine Learning Engineer to join our Foundation AI team....
Location:
United States, Boston
Salary:
170000.00 - 230000.00 USD / Year
Whoop
Expiration Date
Until further notice
Requirements
  • Advanced degree (Master’s or Ph.D.) in Computer Science, Machine Learning, Electrical Engineering, or a related field, or equivalent professional experience
  • 7+ years of experience in applied ML, AI research, or large-scale modeling, with a track record of delivering production systems
  • Expertise in modern deep learning (e.g., transformers, state space models) and multimodal model training
  • Proficiency in Python and deep learning frameworks (e.g., PyTorch, TensorFlow)
  • Experience building and scaling large datasets and training large models in distributed compute environments
  • Strong applied experience with representation learning, self-supervised methods, and fine-tuning for downstream applications
  • Familiarity with MLOps best practices including model versioning, evaluation, CI/CD for ML, and cloud-based compute
  • Excellent communication skills and ability to collaborate cross-functionally with engineers, researchers, and product teams
  • Passion for WHOOP’s mission to improve human performance and extend healthspan through science and technology
Job Responsibility
  • Design, train, and optimize large-scale multimodal foundation models that integrate wearable sensor data, text, biomarkers, and behavioral data
  • Conduct applied research in self-supervised learning, representation learning, and downstream task fine-tuning to advance WHOOP’s core model capabilities
  • Develop scalable, distributed training pipelines for large models on high-performance compute environments
  • Collaborate with MLOps, data engineering, and software engineering teams to operationalize models for production deployment, ensuring robustness, reproducibility, and observability
  • Partner with product and research teams to translate foundation model capabilities into downstream features that deliver meaningful member value
  • Contribute to the technical roadmap and architectural direction for foundation model development at WHOOP
  • Serve as a technical mentor for other data scientists, sharing best practices in deep learning, large-scale training, and multimodal data integration
  • Ensure models adhere to WHOOP’s standards for ethical, transparent, and privacy-preserving AI
What we offer
  • competitive base salaries
  • meaningful equity
  • benefits
  • generous equity package
Employment Type:
Fulltime

Research Scientist / Engineer – Realtime Interactive

At Luma, the Realtime Interactive team is responsible for building an entirely n...
Location:
United States, Palo Alto
Salary:
187500.00 - 395000.00 USD / Year
Luma AI
Expiration Date
Until further notice
Requirements
  • Experience with fine-tuning large-scale generative models
  • Proficiency in PyTorch and distributed training frameworks
  • (Preferred) Strong background in methods for optimizing model inference (distillation, quantization, sparsity, compression, etc.)
  • (Preferred) Experience in gathering, processing, and annotating datasets
Job Responsibility
  • Work on top of pretrained multimodal generative models to fine-tune and optimize them for realtime generation
  • Design novel algorithms and techniques to solve problems with autoregressive visual generation, long-range temporal consistency, and long-term memory
  • Develop interactive applications with tight latency constraints
  • Process data to develop advanced interactive capabilities and controls for World Modeling, such as controlling character and camera movement, audio, and more
Employment Type:
Fulltime

Research Scientist Intern, Embodied Foundation Models

Our team is seeking a talented Applied Scientist Intern to join us for 3-6 month...
Location:
United States, Sunnyvale
Salary:
Not provided
Wayve
Expiration Date
Until further notice
Requirements
  • Currently pursuing a graduate degree in Computer Science, Machine Learning, Robotics, or related technical field
  • Proficient in at least one backend/systems programming language (e.g., Python, Ruby, Java)
  • Previous experience in vision-language models, large language models, natural language processing, especially around reasoning
  • Solid software engineering fundamentals, especially in Python
  • Previously used PyTorch or a similar library for deep learning (e.g., TensorFlow, JAX)
  • Experience with multi-node distributed training of large models
  • Interested in using large-scale multimodal (vision, language, etc.) datasets to improve embodied AI
  • Previous publications in conferences (e.g., CVPR, ICCV, CoRL, NeurIPS, CoLM, RSS, ICRA, among others)
Job Responsibility
  • Work on foundation models for embodied AI, including large-scale pretraining, post-training, leveraging language, or improving reasoning capabilities
  • Train models on large-scale multimodal (vision, language, etc.) data efficiently in a multi-node distributed system, and evaluate their performance on open (and closed) datasets/benchmarks
  • Lead high-impact research and publish at a top-tier conference

Research Scientist Intern, Embodied Foundation Models (Evaluation)

Our team is seeking a talented Applied Scientist Intern to join us for 3-6 month...
Location:
United States, Sunnyvale
Salary:
Not provided
Wayve
Expiration Date
Until further notice
Requirements
  • You are currently pursuing a graduate degree in Computer Science, Machine Learning, Robotics, or a related technical field
  • You are proficient in at least one backend/systems programming language (e.g. Python, Ruby, Java, etc)
  • You have previous experience in vision-language models, large language models, natural language processing, especially around reasoning
  • You have solid software engineering fundamentals, especially in Python
  • You have previously used PyTorch or a similar library for deep learning (e.g., TensorFlow, JAX)
  • You have experience with multi-node distributed training of large models
  • You are interested in using large-scale multimodal (vision, language, etc.) datasets to improve embodied AI
  • You have previous publications in conferences such as CVPR, ICCV, CoRL, NeurIPS, CoLM, RSS, or ICRA, among others
Job Responsibility
  • Work on foundation models for embodied AI, including large-scale pretraining, post-training, leveraging language, or improving reasoning capabilities
  • Train models on large-scale multimodal (vision, language, etc.) data efficiently in a multi-node distributed system, and evaluate their performance on open (and closed) datasets/benchmarks
  • Lead high-impact research and publish at a top-tier conference (e.g., CVPR, ICCV, CoRL, NeurIPS, CoLM, RSS, ICRA, among others)