AI Research Scientist, Evaluations - Meta Superintelligence Labs

Meta

Location:
United States, Menlo Park

Contract Type:
Not provided

Salary:

184,000 - 257,000 USD / Year

Job Description:

Meta is seeking Research Scientists to join the Evaluations team within Meta Superintelligence Labs (MSL). Evaluations are the core of AI progress at MSL, determining what capabilities get built, which features get prioritized, and how fast our models improve. As a Research Scientist, you will provide the technical expertise to measure and understand the capabilities of our frontier AI systems. You'll work in tandem with world-class researchers to envision, develop, and validate novel evaluations that shape the future of AI capability measurement.

This is a technical research role requiring good scientific judgment, creativity, and the ability to drive ambitious research agendas independently. The evaluations you develop will directly influence research direction and major model lines within MSL, so scientific validity, methodological rigor, and clear communication are important. You will collaborate closely with technical leadership to ensure evaluations capture the most important capabilities, translating organizational priorities into measurable benchmarks and evaluation insights back into research direction.

We are looking for exceptional research talent: researchers who have shaped the field of machine learning and are ready to do so again at the frontier of AI. If you are passionate about defining how we measure AI progress and want to shape the scientific foundations of frontier AI development, we encourage you to apply for this opportunity at the core of MSL.

Job Responsibilities:

  • Curate and integrate publicly available and internal benchmarks to direct the capabilities of frontier model development
  • Develop and implement evaluation environments, including environments for novel model capabilities and modalities
  • Collaborate with external data vendors to source and prepare high-quality evaluation datasets
  • Execute on the technical vision of research scientists designing new benchmarks and evaluations
  • Build robust, reusable evaluation pipelines that scale across multiple model lines and product areas
  • Contribute to evaluation tooling that measures the quality and reliability of evaluation suites

Requirements:

  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • PhD degree in Computer Science, Machine Learning, or a related technical field
  • 3+ years of experience in machine learning engineering, machine learning research, or a related technical role
  • Proficiency in Python and experience with ML frameworks such as PyTorch
  • Experience identifying, designing and completing medium to large technical features independently, without guidance
  • Proven success with software engineering practices, including version control, testing, and code review
  • Ability to work independently and adapt to rapidly changing priorities

Nice to have:

  • Publications at peer-reviewed venues (NeurIPS, ICML, ICLR, ACL, EMNLP, or similar) related to language model evaluation, benchmarking, or deep learning
  • Hands-on experience with language model post-training and deep learning systems, or building reinforcement learning environments
  • Experience implementing or developing evaluation benchmarks for large language models and multimodal models (e.g., vision-language, audio, video)
  • Experience working with large-scale distributed systems and data pipelines
  • Familiarity with language model evaluation frameworks and metrics
  • Track record of open-source contributions to ML evaluation tools or benchmarks

What we offer:

  • Bonus
  • Equity
  • Benefits

Additional Information:

Job Posted:
February 20, 2026

Employment Type:
Full-time
