Research Engineer, Frontier Evals & Environments

OpenAI

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:
205000.00 - 380000.00 USD / Year

Job Description:

The Frontier Evals & Environments team builds north star model environments to drive progress towards safe AGI/ASI. The team creates ambitious environments to measure and steer our models, and builds self-improvement loops that guide our training, safety, and launch decisions.

Job Responsibility:

  • Create ambitious RL environments to push our models to their limits
  • Work on measuring frontier model capabilities, skills, and behaviors
  • Develop new methodologies for automatically exploring the behavior of these models
  • Help steer training for our largest training runs, and see the future first
  • Design scalable systems and processes to support continuous evaluation
  • Build self-improvement loops to automate model understanding

Requirements:

  • Passionate and knowledgeable about AGI/ASI measurement
  • Strong engineering and statistical analysis skills
  • Able to think outside the box and have a robust “red-teaming mindset”
  • Experienced in ML research engineering, stochastic systems, observability and monitoring, LLM-enabled applications, and/or another technical domain applicable to AI evaluations
  • Able to operate effectively in a dynamic and extremely fast-paced research environment as well as scope and deliver projects end-to-end

Nice to have:

  • First-hand experience in red-teaming systems—be it computer systems or otherwise
  • An ability to work cross-functionally
  • Excellent communication skills

What we offer:

  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Relocation support for eligible employees
  • Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided
  • Equity
  • Performance-related bonus(es) for eligible employees

Additional Information:

Job Posted:
February 21, 2026

Employment Type:
Full-time

Similar Jobs for Research Engineer, Frontier Evals & Environments

Research Engineer, Frontier Evals & Environments - Finance

The Frontier Evals team builds north star model evaluations to drive progress to...
Location
United States, San Francisco
Salary:
205000.00 - 380000.00 USD / Year
OpenAI
Expiration Date
Until further notice
Requirements
  • Strong engineering and statistical analysis skills (with at least 2-3 years of full-time technical experience)
  • Passionate about evals for real world applications and knowledge work
  • Detail-oriented and thorough
  • Team player / willing to do a variety of tasks to move the team forward
  • Passionate and knowledgeable about AGI/ASI measurement
  • Able to operate effectively in a dynamic and extremely fast-paced research environment as well as scope and deliver projects end-to-end
Job Responsibility
  • Identify important model capabilities, skills, and behaviors that are crucial to financial workflows, and design methods to quantify performance in these areas
  • Own and pursue a research agenda to identify an important model capability (especially as it relates to financial reasoning) and build evals to measure it
  • Continuously refine evaluations of frontier AI models to assess the extent of frontier capabilities
What we offer
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Full-time

AI Architect

We’re hiring an AI Architect to sit at the intersection of frontier AI research,...
Location
United States, San Francisco; New York
Salary:
201600.00 - 241920.00 USD / Year
Scale
Expiration Date
Until further notice
Requirements
  • Deep technical background in applied AI/ML: 5–10+ years in research, engineering, solutions engineering, or technical product roles working on LLMs or multimodal systems, ideally in high-stakes, customer-facing environments
  • Hands-on experience with model improvement workflows: demonstrated experience with post-training techniques, evaluation design, benchmarking, and model quality iteration
  • Ability to work on hard, ambiguous technical problems: proven track record of partnering directly with advanced customers or research teams to scope, reason through, and execute on deep technical challenges involving frontier models
  • Strong technical fluency: you can read papers, interrogate metrics, write or review complex Python/SQL for analysis, and reason about model-data trade-offs
  • Executive presence with world-class researchers and enterprise leaders; excellent writing and storytelling
  • Bias to action: you ship, learn, and iterate.
Job Responsibility
  • Translate research → product: work with client side researchers on post-training, evals, safety/alignment and build the primitives, data, and tooling they need
  • Partner deeply with core customers and frontier labs: work hands-on with leading AI teams and frontier research labs to tackle hard, open-ended technical problems related to frontier model improvement, performance, and deployment
  • Shape and propose model improvement work: translate customer and research objectives into clear, technically rigorous proposals—scoping post-training, evaluation, and safety work into well-defined statements of work and execution plans
  • Translate research into production impact: collaborate with customer-side researchers on post-training, evaluations, and alignment, and help design the data, primitives, and tooling required to improve frontier models in practice
  • Own the end-to-end lifecycle: lead discovery, write crisp PRDs and technical specs, prioritize trade-offs, run experiments, ship initial solutions, and scale successful pilots into durable, repeatable offerings
  • Lead complex, high-stakes engagements: independently run technical working sessions with senior customer stakeholders, define success metrics, surface risks early, and drive programs to measurable outcomes
  • Partner across Scale: collaborate closely with research (agents, browser/SWE agents), platform, operations, security, and finance to deliver reliable, production-grade results for demanding customers
What we offer
  • Comprehensive health, dental and vision coverage
  • Retirement benefits
  • A learning and development stipend
  • Generous PTO
  • Commuter stipend
  • Equity-based compensation
  • Full-time

Researcher, Preparedness

The Preparedness team helps us prepare for the development of increasingly capab...
Location
United States, San Francisco
Salary:
295000.00 - 445000.00 USD / Year
OpenAI
Expiration Date
Until further notice
Requirements
  • Passionate and knowledgeable about short-term and long-term AI safety risks
  • Ability to think outside the box and have a robust 'red-teaming mindset'
  • Experience in ML research engineering, ML observability and monitoring, creating large language model-enabled applications, and/or another technical domain applicable to AI risk
  • Able to operate effectively in a dynamic and extremely fast-paced research environment as well as scope and deliver projects end-to-end
Job Responsibility
  • Own the scientific validity of frontier preparedness capability evaluations: designing new evals grounded in real threat models (including high-consequence domains like CBRN as well as cyber and other frontier-risk areas), and maintaining existing evals so they don't go stale or silently regress
  • Define datasets, graders, rubrics, and threshold guidance, and produce auditable artifacts (evaluation cards, capability reports, system-card inputs) that leadership can trust during high-stakes launches
  • Work on identifying emerging AI safety risks and new methodologies for exploring the impact of these risks
  • Build (and then continuously refine) evaluations of frontier AI models that assess the extent of identified risks
  • Design and build scalable systems and processes that can support these kinds of evaluations
  • Contribute to the refinement of risk management and the overall development of 'best practice' guidelines for AI safety evaluations
What we offer
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Full-time

Research Engineering Manager - Model Training

Perplexity is seeking a Research Engineering Manager to lead the team of all-sta...
Location
United States, San Francisco
Salary:
300000.00 - 470000.00 USD / Year
Perplexity
Expiration Date
Until further notice
Requirements
  • Proven experience with large-scale LLMs and Deep Learning systems
  • Strong Python and PyTorch skills
  • Experience leading or managing research or engineering teams working on large-scale AI model development, including driving complex projects from idea to production
  • Self‑starter with a willingness to take ownership of tasks and navigate ambiguity in a fast‑moving environment
  • Passion for tackling challenging problems in AI model quality, speed, safety, and reliability
  • 10+ years of technical experience, with at least 2 of those years as a manager and at least 4 of those years working on large-scale AI model development
Job Responsibility
  • Lead a team of researchers and engineers focused on training SotA models for Perplexity-relevant use cases, leveraging the latest supervised and reinforcement learning techniques
  • Drive research and engineering efforts to develop production models through advanced model training and alignment techniques, including RL, SFT, and other approaches
  • Become deeply familiar with the team’s technical stack, leading from the front through hands-on technical contributions
  • Own the data, training, and eval pipelines required to train and continuously improve LLMs
  • Design and iterate on model training and finetuning algorithms (e.g., preference‑based methods, reinforcement learning from human or AI feedback) through an approach that balances scientific rigor and iteration velocity
  • Design evaluations and improve the production model training pipeline to reliably deliver models that lie on the Pareto frontier of speed and quality
  • Work closely with engineering teams to integrate in-house models into our product and rapidly iterate based on real‑world usage
  • Manage day‑to‑day execution, project planning, and prioritization for the model training team to hit ambitious quality and performance goals
What we offer
  • Equity
  • Health
  • Dental
  • Vision
  • Retirement
  • Fitness
  • Commuter and dependent care accounts
  • Full-time

Product Manager, Central Products

Meta Product Managers work with cross-functional teams of engineers, designers, ...
Location
United States, Menlo Park
Salary:
205000.00 - 277000.00 USD / Year
Meta
Expiration Date
Until further notice
Requirements
  • 10+ years of experience in product management and/or product design
  • 10+ years of experience working collaboratively with engineering, design and user research teams
  • Experience navigating through the full product life-cycle, integrating customer feedback into product requirements, driving prioritization, and pre- and post-launch execution
  • Critical thinking and analytical leadership experience
  • Demonstrated proficiency using AI-enabled tools to build product artifacts at scale
  • Experience developing and championing AI-native strategies across organizations
  • Experience presenting to executive audiences
  • BA/BS in Computer Science or related field
Job Responsibility
  • Serve as the primary driver for identifying significant near- and long-term opportunities in a large product area, and drive product mission, strategies, and roadmaps in the context of broader organizational strategies and goals
  • Generate buy-in and drive consensus across organizations. Bring clarity and structure to ambiguous opportunities. Consistently demonstrate initiative and execute with limited oversight
  • Critically evaluate when AI is (and isn't) the optimal solution at portfolio level, setting the standard for rigorous tradeoff analysis
  • Translate AI capabilities into compelling, differentiated product visions that define market categories
  • Champion AI-native strategies including comprehensive evals and data strategies that enable org-wide continuous improvement
  • Drive product development with teams of engineers and designers, while maintaining team health
  • Work closely with cross-functional teams to drive product mission, define product requirements, coordinate resources from other groups (design, legal, etc.), develop roadmaps, and guide the team through key milestones
  • Reimagine workflows, responsibly using AI tools to transform team velocity and capability at organizational scale
  • Foster a culture of rapid experimentation and learning that becomes a competitive advantage
  • Scale AI best practices (including responsible AI use), workflows, and artifacts across the organization so capability compounds exponentially
What we offer
  • Bonus
  • Equity
  • Benefits
  • Full-time

Product Manager - Growth & Monetization

At Meta, we're shaping innovative experiences in service of giving people the po...
Location
Singapore
Salary:
Not provided
Meta
Expiration Date
Until further notice
Requirements
  • 10+ years of experience in product management and/or product design
  • 10+ years of experience working collaboratively with engineering, design and user research teams
  • Experience navigating through the full product life-cycle, integrating customer feedback into product requirements, driving prioritization, and pre- and post-launch execution
  • Critical thinking and analytical leadership experience
  • Demonstrated proficiency using AI-enabled tools to build product artifacts at scale
  • Experience developing and championing AI-native strategies across organizations
  • Experience presenting to executive audiences
  • BA/BS in Computer Science or related field
Job Responsibility
  • Serve as the primary driver for identifying significant near- and long-term opportunities in a large product area, and drive product mission, strategies, and roadmaps in the context of broader organizational strategies and goals
  • Generate buy-in and drive consensus across organizations. Bring clarity and structure to ambiguous opportunities. Consistently demonstrate initiative and execute with limited oversight
  • Critically evaluate when AI is (and isn't) the optimal solution at portfolio level, setting the standard for rigorous tradeoff analysis
  • Translate AI capabilities into compelling, differentiated product visions that define market categories
  • Champion AI-native strategies including comprehensive evals and data strategies that enable org-wide continuous improvement
  • Drive product development with teams of engineers and designers, while maintaining team health
  • Work closely with cross-functional teams to drive product mission, define product requirements, coordinate resources from other groups (design, legal, etc.), develop roadmaps, and guide the team through key milestones
  • Reimagine workflows, responsibly using AI tools to transform team velocity and capability at organizational scale
  • Foster a mindset of rapid experimentation and learning that becomes a competitive advantage
  • Scale AI best practices (including responsible AI use), workflows, and artifacts across the organization so capability compounds exponentially

Bi & Bigdata Quality Specialist

At Vodafone, we’re not just shaping the future of connectivity for our customers...
Location
Egypt, Giza
Salary:
Not provided
Vodafone
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in Computer Science or a related field
  • 0 to 1 year of engineering experience, specifically in building quality gates for data pipelines, ensuring data accuracy, and administering distributed systems
  • Experience with big data systems and technologies such as Java, Python, NoSQL, Node.js, Angular, Kafka, YARN, HBase, ZooKeeper
  • Understanding of distributed systems like Hadoop (HDFS)
  • Knowledge of data processing frameworks, such as Apache Hadoop (MapReduce), Apache Spark, Apache Kafka
  • Understanding of data warehousing concepts and technologies
  • Familiarity with ETL tools and processes for efficiently moving and transforming data
Job Responsibility
  • Delivering end-to-end development solutions for BI and Big Data quality tasks, including pipelines, data trends and dashboards
  • Managing system optimization, monitoring workflow performance, administering DWH applications and Big Data clusters
  • Managing vendor communications and handling system upgrades
  • Design, develop, and implement solutions to monitor, validate, and improve data accuracy, consistency, and completeness
  • Build data quality gates to ensure and maintain scalable data pipelines for processing and validating large datasets efficiently
  • Develop and enhance ETL processes to ensure seamless data movement across systems while maintaining data quality dashboards
  • Ensure adherence to data governance policies, regulatory standards, and best practices for data management
  • Investigate data anomalies, identify root causes, and implement corrective measures proactively
  • Optimize query performance, database indexing, and system configurations to enhance efficiency
  • Work closely with data analysts, engineers, business teams, and vendors to ensure data quality objectives are met
  • Full-time

Family Law Partner

Join the largest, fastest-growing specialist family law firm in the country. We ...
Location
United Kingdom, Cambridge
Salary:
90000.00 GBP / Year
Stowe Family Law
Expiration Date
Until further notice
Requirements
  • Commercially minded and able to spot opportunities to improve client service
  • Ambitious and keen to reach your potential
  • Able to convert new clients and lead a range of complex finance and children cases
Job Responsibility
  • Convert new clients
  • Lead a range of complex finance and children cases
What we offer
  • Bonus
  • Wellbeing culture including Mental Wellbeing days and access to counselling sessions
  • Volunteering leave
  • Diversity public holidays
  • 26 days holiday
  • Enhanced adoption, maternity and paternity pay
  • Paid leave for fertility treatment
  • Emergency dependants leave
  • Bereavement leave
  • Medicash health insurance: 24/7 GPs, dental, counselling, gym discounts
  • Full-time