Research Engineer, Frontier Evals & Environments

OpenAI

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

205000.00 - 380000.00 USD / Year

Job Description:

The Frontier Evals & Environments team builds north star model environments to drive progress towards safe AGI/ASI. This team builds ambitious environments to measure and steer our models, and creates self-improvement loops to steer our training, safety, and launch decisions.

Job Responsibility:

  • Create ambitious RL environments to push our models to their limits
  • Work on measuring frontier model capabilities, skills, and behaviors
  • Develop new methodologies for automatically exploring the behavior of these models
  • Help steer training for our largest training runs, and see the future first
  • Design scalable systems and processes to support continuous evaluation
  • Build self-improvement loops to automate model understanding

Requirements:

  • Passionate and knowledgeable about AGI/ASI measurement
  • Strong engineering and statistical analysis skills
  • Able to think outside the box, with a robust “red-teaming mindset”
  • Experienced in ML research engineering, stochastic systems, observability and monitoring, LLM-enabled applications, and/or another technical domain applicable to AI evaluations
  • Able to operate effectively in a dynamic and extremely fast-paced research environment as well as scope and deliver projects end-to-end

Nice to have:

  • First-hand experience in red-teaming systems—be it computer systems or otherwise
  • An ability to work cross-functionally
  • Excellent communication skills

What we offer:

  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Relocation support for eligible employees
  • Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided
  • Equity
  • Performance-related bonus(es) for eligible employees

Additional Information:

Job Posted:
February 21, 2026

Employment Type:
Full-time

Similar Jobs for Research Engineer, Frontier Evals & Environments

Research Engineer, Frontier Evals & Environments - Finance

The Frontier Evals team builds north star model evaluations to drive progress to...
Location:
United States, San Francisco
Salary:
205000.00 - 380000.00 USD / Year
OpenAI
Expiration Date:
Until further notice
Requirements:
  • Strong engineering and statistical analysis skills (with at least 2-3 years of full-time technical experience)
  • Passionate about evals for real world applications and knowledge work
  • Detail-oriented and thorough
  • Team player / willing to do a variety of tasks to move the team forward
  • Passionate and knowledgeable about AGI/ASI measurement
  • Able to operate effectively in a dynamic and extremely fast-paced research environment as well as scope and deliver projects end-to-end
Job Responsibility:
  • Identify important model capabilities, skills, and behaviors that are crucial to financial workflows, and design methods to quantify performance in these areas
  • Own and pursue a research agenda to identify an important model capability (especially as it relates to financial reasoning) and build evals to measure it
  • Continuously refine evaluations of frontier AI models to assess the extent of frontier capabilities
What we offer:
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
Employment Type:
Full-time

AI Architect

We’re hiring an AI Architect to sit at the intersection of frontier AI research,...
Location:
United States, San Francisco; New York
Salary:
201600.00 - 241920.00 USD / Year
Scale
Expiration Date:
Until further notice
Requirements:
  • Deep technical background in applied AI/ML: 5–10+ years in research, engineering, solutions engineering, or technical product roles working on LLMs or multimodal systems, ideally in high-stakes, customer-facing environments
  • Hands-on experience with model improvement workflows: demonstrated experience with post-training techniques, evaluation design, benchmarking, and model quality iteration
  • Ability to work on hard, ambiguous technical problems: proven track record of partnering directly with advanced customers or research teams to scope, reason through, and execute on deep technical challenges involving frontier models
  • Strong technical fluency: you can read papers, interrogate metrics, write or review complex Python/SQL for analysis, and reason about model-data trade-offs
  • Executive presence with world-class researchers and enterprise leaders; excellent writing and storytelling
  • Bias to action: you ship, learn, and iterate.
Job Responsibility:
  • Translate research → product: work with client side researchers on post-training, evals, safety/alignment and build the primitives, data, and tooling they need
  • Partner deeply with core customers and frontier labs: work hands-on with leading AI teams and frontier research labs to tackle hard, open-ended technical problems related to frontier model improvement, performance, and deployment
  • Shape and propose model improvement work: translate customer and research objectives into clear, technically rigorous proposals—scoping post-training, evaluation, and safety work into well-defined statements of work and execution plans
  • Translate research into production impact: collaborate with customer-side researchers on post-training, evaluations, and alignment, and help design the data, primitives, and tooling required to improve frontier models in practice
  • Own the end-to-end lifecycle: lead discovery, write crisp PRDs and technical specs, prioritize trade-offs, run experiments, ship initial solutions, and scale successful pilots into durable, repeatable offerings
  • Lead complex, high-stakes engagements: independently run technical working sessions with senior customer stakeholders, define success metrics, surface risks early, and drive programs to measurable outcomes
  • Partner across Scale: collaborate closely with research (agents, browser/SWE agents), platform, operations, security, and finance to deliver reliable, production-grade results for demanding customers
What we offer:
  • Comprehensive health, dental and vision coverage
  • Retirement benefits
  • A learning and development stipend
  • Generous PTO
  • Commuter stipend
  • Equity-based compensation
Employment Type:
Full-time

Researcher, Preparedness

The Preparedness team helps us prepare for the development of increasingly capab...
Location:
United States, San Francisco
Salary:
295000.00 - 445000.00 USD / Year
OpenAI
Expiration Date:
Until further notice
Requirements:
  • Passionate and knowledgeable about short-term and long-term AI safety risks
  • Ability to think outside the box and have a robust 'red-teaming mindset'
  • Experience in ML research engineering, ML observability and monitoring, creating large language model-enabled applications, and/or another technical domain applicable to AI risk
  • Able to operate effectively in a dynamic and extremely fast-paced research environment as well as scope and deliver projects end-to-end
Job Responsibility:
  • Own the scientific validity of frontier preparedness capability evaluations: designing new evals grounded in real threat models (including high-consequence domains like CBRN as well as cyber and other frontier-risk areas), and maintaining existing evals so they don't go stale or silently regress
  • Define datasets, graders, rubrics, and threshold guidance, and produce auditable artifacts (evaluation cards, capability reports, system-card inputs) that leadership can trust during high-stakes launches
  • Work on identifying emerging AI safety risks and new methodologies for exploring the impact of these risks
  • Build (and then continuously refine) evaluations of frontier AI models that assess the extent of identified risks
  • Design and build scalable systems and processes that can support these kinds of evaluations
  • Contribute to the refinement of risk management and the overall development of 'best practice' guidelines for AI safety evaluations
What we offer:
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
Employment Type:
Full-time

Research Engineering Manager - Model Training

Perplexity is seeking a Research Engineering Manager to lead the team of all-sta...
Location:
United States, San Francisco
Salary:
300000.00 - 470000.00 USD / Year
Perplexity
Expiration Date:
Until further notice
Requirements:
  • Proven experience with large-scale LLMs and Deep Learning systems
  • Strong Python and PyTorch skills
  • Experience leading or managing research or engineering teams working on large-scale AI model development, including driving complex projects from idea to production
  • Self‑starter with a willingness to take ownership of tasks and navigate ambiguity in a fast‑moving environment
  • Passion for tackling challenging problems in AI model quality, speed, safety, and reliability
  • 10+ years of technical experience, with at least 2 of those years as a manager and at least 4 of those years working on large-scale AI model development
Job Responsibility:
  • Lead a team of researchers and engineers focused on training SotA models for Perplexity-relevant use cases, leveraging the latest supervised and reinforcement learning techniques
  • Drive research and engineering efforts to develop production models through advanced model training and alignment techniques, including RL, SFT, and other approaches
  • Become deeply familiar with the team’s technical stack, leading from the front through hands-on technical contributions
  • Own the data, training, and eval pipelines required to train and continuously improve LLM models
  • Design and iterate on model training and finetuning algorithms (e.g., preference‑based methods, reinforcement learning from human or AI feedback) through an approach that balances scientific rigor and iteration velocity
  • Design evaluations and improve the production model training pipeline to reliably deliver models that lie on the Pareto frontier of speed and quality
  • Work closely with engineering teams to integrate in-house models into our product and rapidly iterate based on real‑world usage
  • Manage day‑to‑day execution, project planning, and prioritization for the model training team to hit ambitious quality and performance goals
What we offer:
  • Equity
  • Health
  • Dental
  • Vision
  • Retirement
  • Fitness
  • Commuter and dependent care accounts
Employment Type:
Full-time

Store Associate

Retail Store Associates play a meaningful role within the CVS Health family. At ...
Location:
United States, Bradenton
Salary:
15.00 - 19.00 USD / Hour
CVS Health
Expiration Date:
April 12, 2026
Requirements:
  • At least 16 years of age
  • Physical Requirements: Remaining upright on the feet, particularly for sustained periods of time
  • Lifting and exerting up to 35 lbs of force occasionally, up to 10 lbs of force frequently, and a negligible amount of force regularly to move objects to and from, including overhead lifting
  • Visual Acuity - Having close visual acuity to perform activities such as: viewing a computer terminal, reading, visual inspection involving small parts/details
Job Responsibility:
  • Providing differentiated customer service by anticipating customer needs, demonstrating compassion and care in all interactions, and actively identifying and resolving potential service issues
  • Focusing on the customer by giving a warm and friendly greeting, maintaining eye contact and offering help locating additional items, when needed
  • Accurately performing cashier duties: handling cash, checks and credit card transactions with precision while following company policies and procedures
  • Maintaining the sales floor by restocking shelves, checking in vendors, updating pricing information and completing inventory management tasks as directed by store manager
  • Supporting opening and closing store activities, when needed
  • Providing customer support to all departments, including photo and beauty, ensuring departments are fully stocked and operational while remaining current with all updated services and tools
  • Assisting pharmacy personnel when needed, including working regular shifts in the pharmacy as part of opportunities for growth and career development
  • Embracing and advocating for new CVS services and loyalty programs that support our purpose of helping people on their path to better health
What we offer:
  • Affordable medical plan options
  • 401(k) plan (including matching company contributions)
  • Employee stock purchase plan
  • No-cost programs for all colleagues including wellness screenings, tobacco cessation and weight management programs, confidential counseling and financial coaching
  • Paid time off
  • Flexible work schedules
  • Family leave
  • Dependent care resources
  • Colleague assistance programs
  • Tuition assistance
Employment Type:
Part-time

Grad Intern - Marketing Cardio Renal Metabolic

Do more with the knowledge you’re working hard to acquire and the passion you al...
Location:
Italy, Milan
Salary:
Not provided
Amgen
Expiration Date:
Until further notice
Requirements:
  • Master's degree in scientific or economic disciplines, including project management
  • Excellent knowledge of the Microsoft Office suite
  • Strong written and verbal communication skills
  • Good project management, analytical abilities, strategic thinking, creativity, and a collaborative team-oriented attitude
  • Fluent English, both oral and written
Job Responsibility:
  • Be part of the Cardio-Renal-Metabolic Marketing team of Amgen Italy
  • Support the Marketing team, actively giving contributions to all department activities
  • Take part in marketing projects and campaigns under the supervision of a Brand Lead
  • Help define the communication plan and execute related promotional materials, ensuring quality and consistency of content
  • Design and implement multi-channel marketing campaigns
  • Develop analyses to assess the effectiveness of communication strategies, highlighting successes and optimization opportunities
  • Collaborate with cross-functional teams in organizing scientific events, internal meetings, and training for the field force
What we offer:
  • Full support and career-development resources to expand your skills, enhance your expertise, and maximize your potential along your career journey
  • Diverse and inclusive community of belonging, where teammates are empowered to bring ideas to the table and act

Safety engineer

Are you passionate about safety, environmental protection, and functional safety...
Location:
United Kingdom, Bristol
Salary:
46400.00 GBP / Year
Defence Equipment & Support
Expiration Date:
Until further notice
Requirements:
  • Must have either demonstrable experience of audit and assurance activities to an applicable standard (e.g. ISO 9001, 45001 or 14001) or an understanding of Systems Safety in the defence environment, e.g. DEF STAN 00-056
  • Hold a Science, Technology, Engineering and/or Mathematics based qualification at Regulated Qualifications Framework (RQF) Level 4 or demonstrable relevant experience
  • Be professionally registered or intend to be professionally registered with a relevant Professional body/institution related to your discipline, as either: Registered Scientist (RSci) or Incorporated Engineer (IEng)
  • Must have lived in the UK for the last 5 years
  • Must obtain Security Check (SC) security clearance without any caveats
Job Responsibility:
  • Lead technical safety assurance activities and collaborate with stakeholders to assure the systems and equipment we deliver are safe to operate
  • Working in a highly regulated area, educate others on relevant legislation, regulation, policy, processes, and standards to provide assurance against technical documentation for a sub-system
  • Identify and analyse safety acquisition risk reduction measures, assuring that these are adequately documented and managed
  • Develop and maintain ISEA safety assurance plans, monitor compliance, and ensure that safety assurance is sufficient to demonstrate that systems are safe to operate
  • Produce, review, endorse, and recommend acceptance of safety-related artefacts, including when equipment or a system is operational or being modified
What we offer:
  • Ministry of Defence contributes £13,442 towards you being a member of the Civil Service Defined Benefit Pension scheme
  • 25 days’ annual leave, increasing by 1 day a year up to 30 days, plus 8 bank holidays and a day off for the King’s birthday
  • Market-leading average employer pension contribution of 28.97%
  • Annual performance-based bonus and recognition awards
  • Access to specialist training and funded qualifications
  • Support for progression
  • Huge range of discounts
  • Volunteering days
  • Enhanced parental leave schemes

Receptionist/Administrator

Help us to deliver great primary care by improving access, outcomes and patient ...
Location:
United Kingdom, Great Missenden
Salary:
24960.00 GBP / Year
Operose Health
Expiration Date:
February 25, 2026
Requirements:
  • Reception or customer care experience
  • Excellent communicator both spoken and written
  • Basic PC skills such as Word, Excel and email
  • Able to work within processes, procedures and maintain confidentiality and data security
  • Previous experience of working in the NHS is welcome but not essential
Job Responsibility:
  • Responding to patient queries and liaising with the wider primary care team
  • Managing appointment requests
  • Signposting patients to our range of services
  • Maintaining patient records and confidentiality
  • Emailing, scanning and coding clinical correspondence
  • Processing prescription requests
  • Utilising other information systems to support efficient workflow processes
What we offer:
  • 27 days annual leave plus bank holidays pro rata
  • Access to bespoke learning management system and annual formative clinical assessments
  • Opportunities to specialise and develop
  • Car benefit scheme – specialising in electric vehicles
  • Cycle to work scheme
  • Travel season ticket loans
  • Discount cards
  • Employee wellbeing services including free yoga videos and employee wellbeing app
Employment Type:
Part-time