Research Scientist / Engineer – Pre-training / Scaling

United States, Palo Alto · 187500.00 - 395000.00 USD / Year · Job Posted January 13, 2026

Job Description

At Luma, the Pre-Training / Scaling team is responsible for building the core multimodal AI systems that power our entire platform. Working at the forefront of generative AI research, this team develops the fundamental architectures and training methodologies that enable our models to see, hear, understand, and interact with the world across video, image, text, and audio modalities.

Job Responsibility

  • Lead cutting-edge research in multimodal foundation models spanning video, image, text, and audio
  • Design and implement novel algorithms, architectures, and techniques for large-scale generative AI models
  • Develop training methodologies for foundation models across thousands of GPUs
  • Research and implement state-of-the-art techniques in Autoregressive LLMs, Vision Language Models, and/or Diffusion Models
  • Collaborate with cross-functional teams to transition research into production systems

Requirements

  • Expertise in Python and PyTorch with experience building ML models from scratch
  • Deep understanding of multimodal generative models and deep learning architectures
  • (Preferred) Strong research track record in generative AI with published work in top-tier venues
  • (Preferred) Experience with large-scale distributed training systems

Similar Jobs for Research Scientist / Engineer – Pre-training / Scaling

8 matching positions

Principal/Senior Applied Scientist Security Models Training Team - Next-Gen frontier research

The Security Models Training team is expanding to drive the development of a new...
Location: Israel, Tel Aviv, Herzliya
Salary: Not provided
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • M.Sc. / Ph.D. in Computer Science, Information Systems, Electrical or Computer Engineering or Data Science (Ph.D. strongly preferred)
  • Candidates with M.Sc. / Ph.D. in related fields with proven industry experience or a strong publication record in the areas of LLM, Information Retrieval, Machine Learning, Natural Language Processing, Time Series Forecasting and Deep Learning are considered as well
  • Proven hands-on experience of at least 5 years (including post-grad work) in building and deploying Machine Learning products
  • Key areas of expertise include Natural Language Processing and Large Language Models, along with an understanding of concepts such as Privacy and Responsible AI
  • Candidates are expected to demonstrate a strong history of successfully translating applied research into production-ready solutions, along with a proven track record of delivering projects within large-scale production environments
  • Proven expertise in the LLM and/or time-series forecasting domain, demonstrating comprehensive knowledge of relevant concepts in the domain
  • Ideal applicants should be proficient in areas such as LLM pre- and post-training (including CPT, SFT, and RL), LLM benchmarking, agentic flows, and model alignment
  • Hands-on experience in building neural model architectures at the 100M+ scale and the proficiency to adapt them at all abstraction levels down to the individual block (e.g., changing the inner workings of an attention block, introducing new blocks, or changing the routing)
  • Demonstrated proficiency in problem-solving and data analysis, with substantial expertise in evaluating the performance of large language models (LLMs) and/or time-series forecasting models, developing benchmarks tailored to practical scenarios
Job Responsibility
  • Technical Leadership & Ownership: Set technical direction for major security domain initiatives; lead security model programs spanning pre-training, task tuning, reinforcement learning, and evaluation; translate cutting-edge research into production-ready capabilities
  • Advanced Model Design – Build and customize deep learning model architectures (e.g., modifying transformer blocks or attention/memory modules) at the SLM/LLM scale, making principled architectural tradeoffs to improve reliability, robustness, and security-specific behavior
  • Advanced Model Training – Apply deep expertise in pre-training, post-training, and reinforcement learning (RL) for both language and other modalities, including time-series
  • Design & Evaluate Datasets – Build high-quality datasets and benchmarks; define objective evaluation frameworks and quality gates; run ablation studies to measure impact and optimize data and training effectiveness to support confident product decisions
  • Develop Data Infrastructure – Create and maintain scalable pipelines for ingestion, preprocessing, filtering, and annotation of large, complex datasets, with attention to privacy, governance, and long‑term reuse across security scenarios
Employment type: Full-time

AI Research Scientist, Media Data Research - MSL FAIR

Meta is seeking AI research scientists to help us build the data foundation for ...
Location: United States, Menlo Park
Salary: 154000.00 - 217000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • PhD in Computer Science or a related technical field
  • 1+ year of industry research experience in LLM/LMM, computer vision, or related AI/ML models
  • Experience owning and/or driving complex technical projects from end-to-end
  • Practical experience with multimodal pre-training or mid-training data curation for large media perception or generation models
  • Published research in leading peer-reviewed conferences (e.g., ACL, NeurIPS, ICML, ICLR, AAAI, KDD, CVPR, ICCV) and/or demonstrated significant industry influence in the field of AI
Job Responsibility
  • Collaborate with cross-functional teams to develop Meta’s next foundational models
  • Advance our understanding of data research, such as how to overcome data walls and how best to create synthetic data
  • Fundamentally improve our data velocity across workflows and projects by contributing to quality in data tooling
  • Execute on high priority projects in pre-training, mid-training, or post-training data curation
  • Apply specialized expertise in video/image generation, video/image perception, OCR, data scaling laws, or data mixing
  • Lead complex technical projects end-to-end
What we offer
  • bonus
  • equity
  • benefits

AI Research Scientist, Text Data Research - MSL FAIR

Meta is seeking AI research scientists to help us build the data foundation for ...
Location: United States, Menlo Park
Salary: 154000.00 - 217000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • PhD in Computer Science or a related technical field
  • 1+ year of industry research experience in LLM/NLP or related AI/ML models
  • Experience owning and/or driving complex technical projects from end-to-end
  • Practical experience with pre-training or mid-training data curation for large foundational models and experience working with organic, synthetic, agentic, or reasoning data for LLMs
  • Published research in leading peer-reviewed conferences (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP) and/or demonstrated significant industry influence in the field of AI
Job Responsibility
  • Collaborate with cross-functional teams to develop Meta’s next foundational models
  • Advance our understanding of data research, such as how to overcome data walls and how best to create synthetic data
  • Architect efficient and scalable data curation systems and pipelines
  • Fundamentally improve our data velocity across workflows and projects by contributing to the advancement of data tooling
  • Execute on high priority projects in pre-training, mid-training, or post-training data curation
  • Apply specialized expertise in agentic data, synthetic data, reasoning data, web parser, coding data, data scaling laws, or datamix optimization
  • Lead complex technical projects end-to-end
What we offer
  • bonus
  • equity
  • benefits

AI Research Scientist, Media Data Research - MSL FAIR

Meta is seeking AI research scientists to help us build the data foundation for ...
Location: United States, Bellevue
Salary: 122000.00 - 181000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Currently has, or is in the process of obtaining, a Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience. Degree must be completed prior to joining Meta
  • PhD in Computer Science or a related technical field, plus 1+ years of industry research experience in LLM/LMM, computer vision, or related AI/ML models
  • Published research in leading peer-reviewed conferences (e.g., ACL, NeurIPS, ICML, ICLR, AAAI, KDD, CVPR, ICCV) and/or demonstrated significant industry influence in the field of AI
  • Experience owning and/or driving complex technical projects from end-to-end
  • Practical experience with multimodal pre-training or mid-training data curation for large media perception or generation models
Job Responsibility
  • Collaborate with cross-functional teams to develop Meta’s next foundational models
  • Advance our understanding of data research, such as how to overcome data walls and how best to create synthetic data
  • Fundamentally improve our data velocity across workflows and projects by contributing to innovation in data tooling
  • Execute on high priority projects in pre-training, mid-training, or post-training data curation
  • Apply specialized expertise in video/image generation, video/image perception, OCR, data scaling laws, or data mixing
  • Lead complex technical projects end-to-end
What we offer
  • bonus
  • equity
  • benefits

AI Research Scientist, Text Data Research - MSL FAIR

Meta is seeking AI research scientists to help us build the data foundation for ...
Location: United States, Menlo Park
Salary: 184000.00 - 257000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • PhD in Computer Science or a related technical field
  • 2+ years of industry research experience in LLM/NLP or related AI/ML models
  • Experience as a formal technical lead, leading major technical initiatives with cross-functional impact, and/or influencing strategy across multiple teams
  • Practical experience with pre-training or mid-training data curation for large foundational models and experience working with organic, synthetic, agentic, or reasoning data for LLMs
  • Published research in leading peer-reviewed conferences (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP) and/or demonstrated significant industry influence in the field of AI
Job Responsibility
  • Collaborate with cross-functional teams to develop Meta’s next foundational models
  • Advance our understanding of data research, such as how to overcome data walls and how best to create synthetic data
  • Fundamentally improve our data velocity across workflows and projects by contributing to the advancement of data tooling
  • Architect efficient and scalable data curation systems and pipelines
  • Execute on high priority projects in pre-training, mid-training, or post-training data curation
  • Apply specialized expertise in agentic data, synthetic data, reasoning data, web parser, coding data, data scaling laws, or datamix optimization
  • Lead complex technical projects end-to-end
What we offer
  • bonus
  • equity
  • benefits

AI Research Scientist, Media Data Research

Meta is seeking AI research scientists to help us build the data foundation for ...
Location: United States, Menlo Park
Salary: 184000.00 - 257000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • PhD in Computer Science or a related technical field
  • 2+ years of industry research experience in LLM/NLP, computer vision, or related AI/ML models
  • Experience as a formal technical lead, leading major technical initiatives with cross-functional impact, and/or influencing strategy across multiple teams
  • Practical experience with multimodal pre-training or mid-training data curation for large media perception or generation models
  • Published research in leading peer-reviewed conferences (e.g., ACL, NeurIPS, ICML, ICLR, AAAI, KDD, CVPR, ICCV) and/or demonstrated significant industry influence in the field of AI
Job Responsibility
  • Collaborate with cross-functional teams to develop Meta’s next foundational models
  • Advance our understanding of data research, such as how to overcome data walls and how best to create synthetic data
  • Fundamentally improve our data velocity across workflows and projects by contributing to the advancement of data tooling
  • Execute on high priority projects in pre-training, mid-training, or post-training data curation
  • Apply specialized expertise in video/image generation, video/image perception, OCR, data scaling laws, or data mixing
  • Lead complex technical projects end-to-end
What we offer
  • bonus
  • equity
  • benefits
Employment type: Full-time

AI Research Scientist (Technical Leadership), Data Research - MSL FAIR

Meta is seeking research scientists to help us build the data foundation for Met...
Location: United States, Menlo Park
Salary: 219000.00 - 301000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • PhD in Computer Science or a related technical field
  • 4+ years of industry research experience in NLP or CV
  • 4+ years of experience as a formal technical lead
  • Experience leading major technical initiatives with cross-functional impact and influencing strategy across multiple teams
  • Practical experience with multimodal pre-training or mid-training data curation for large language models, media perception, or media generation models
  • Published research in leading peer-reviewed conferences (e.g., ACL, NeurIPS, ICML, ICLR, AAAI, KDD, CVPR, ICCV) and/or demonstrated significant industry influence in the field of AI
Job Responsibility
  • Collaborate with cross-functional teams to develop Meta’s next foundational models
  • Advance our understanding of data research, such as how to overcome data walls and how best to create synthetic data
  • Architect efficient and scalable data curation systems and pipelines
  • Fundamentally improve our data velocity across workflows and projects by contributing to the advancement of data tooling
  • Execute on high priority projects in pre-training, mid-training, or post-training data curation
  • Apply specialized expertise in video/image generation, video/image perception, OCR, agentic data, synthetic data, reasoning data, web parser, coding data, data scaling laws, or datamix optimization
  • Lead complex technical projects end-to-end
What we offer
  • bonus
  • equity
  • benefits

Staff Research Engineer, MetaAI Assistant Measurement

Meta Superintelligence Labs is seeking a Staff Research Engineer to provide tech...
Location: United States, Bellevue
Salary: 257000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • Track record of driving technical strategy and landing large-scale research or product impacts in a time-sensitive environment
  • Proven technical vision regarding the future trajectory of Generative AI, specifically in how model performance translates to user utility
  • Expertise in designing and implementing online and offline measurement systems, benchmark building, and data synthesis techniques
  • Experience leading complex, cross-functional technical initiatives, driving consensus across engineering, research, and product boundaries
  • Proficiency in Python and deep learning frameworks (e.g., PyTorch), with the ability to prototype and implement complex methodologies
Job Responsibility
  • Architect Scientific Strategy: Define and lead the execution of the scientific roadmap for AI Assistant measurement, ensuring methodologies are rigorous, scalable, and aligned with product goals
  • Innovate & Build: Spearhead the research and development of novel offline and online evaluation metrics, automated benchmarks, and synthetic data generation pipelines to close the loop between model training and deployment
  • Cross-Functional Technical Leadership: Serve as the primary scientific liaison to pre-training, post-training, and product teams, ensuring that measurement insights directly influence model architecture and training recipes (the "evaluation flywheel")
  • Mentorship & Influence: Provide technical mentorship and guidance to senior research engineers and applied scientists, fostering a culture of scientific rigor and code without direct management responsibilities
  • Hands-on Contribution: Remain hands-on in code and research, building prototypes for new evaluation frameworks and validating novel measurement theories
What we offer
  • bonus
  • equity
  • benefits