AI Research Scientist (Technical Leadership), Data Research

Job Description

Meta is seeking research scientists to help us build the data foundation for Meta's most advanced Large Language and Media Models. We're looking for researchers with LLM expertise to join us in working with data at scale and pushing beyond the data ceiling. Our team contributes to data curation across all stages of LLM development (pre-training, mid-training, post-training) and all domains/modalities (e.g., web, code, image, video, multilingual). We tackle the hardest challenges at trillion-scale, including organic data curation, synthetic data generation, agent and interaction data, and frontier paradigms that redefine what's possible. Based in Meta Superintelligence Labs (MSL) within the Fundamental AI Research Organization (FAIR), you'll directly contribute to Meta’s frontier models like Llama, while having the chance to collaborate with researchers and engineers across MSL.

Job Responsibilities

Collaborate with cross-functional teams to develop Meta’s next foundational models
Advance our understanding of data research, such as how to overcome data walls and how best to create synthetic data
Architect efficient and scalable data curation systems and pipelines
Fundamentally improve our data velocity across workflows and projects by contributing to the advancement of data tooling
Execute on high-priority projects in pre-training, mid-training, or post-training data curation
Apply specialized expertise in video/image generation, video/image perception, OCR, agentic data, synthetic data, reasoning data, web parsing, coding data, data scaling laws, or data mix optimization
Lead complex technical projects end-to-end

Requirements

Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
PhD in Computer Science or a related technical field
4+ years of industry research experience in NLP or CV
4+ years of experience as a formal technical lead
Experience leading major technical initiatives with cross-functional impact and influencing strategy across multiple teams
Practical experience with multimodal pre-training or mid-training data curation for large language models, media perception, or media generation models
Published research in leading peer-reviewed conferences (e.g., ACL, NeurIPS, ICML, ICLR, AAAI, KDD, CVPR, ICCV) and/or demonstrated significant industry influence in the field of AI

Nice to have

Experience working on frontier-quality/state-of-the-art Large Language or Large Media Models
First-author publications at top peer-reviewed conferences (e.g., ACL, NeurIPS, ICML, ICLR, AAAI, KDD, CVPR, ICCV)
Programming experience in Python and hands-on experience with frameworks like PyTorch or Spark, or related distributed computing frameworks (Ray, DataFlow)
Hands-on experience with SQL and large-scale data handling, and familiarity with frameworks like Spark and Hive

What we offer

bonus
equity
benefits
