Research Engineer, Frontier Evals & Environments Job at OpenAI (San Francisco)

Research Engineer, Frontier Evals & Environments - Finance

The Frontier Evals team builds north star model evaluations to drive progress to...

Location

United States, San Francisco

Salary:

205000.00 - 380000.00 USD / Year

OpenAI

Expiration Date

Until further notice

Requirements

Strong engineering and statistical analysis skills (with at least 2-3 years of full-time technical experience)
Passionate about evals for real world applications and knowledge work
Detail-oriented and thorough
Team player / willing to do a variety of tasks to move the team forward
Passionate and knowledgeable about AGI/ASI measurement
Able to operate effectively in a dynamic and extremely fast-paced research environment as well as scope and deliver projects end-to-end

Job Responsibility

Identify important model capabilities, skills, and behaviors that are crucial to financial workflows, and design methods to quantify performance in these areas
Own and pursue a research agenda to identify an important model capability (especially as it relates to financial reasoning) and build evals to measure it
Continuously refine evaluations of frontier AI models to assess the extent of frontier capabilities

What we offer

Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
401(k) retirement plan with employer match
Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
Mental health and wellness support
Employer-paid basic life and disability coverage
Annual learning and development stipend to fuel your professional growth
Daily meals in our offices, and meal delivery credits as eligible

Full-time

AI Architect

We’re hiring an AI Architect to sit at the intersection of frontier AI research,...

Location

United States, San Francisco; New York

Salary:

201600.00 - 241920.00 USD / Year

Scale

Expiration Date

Until further notice

Requirements

Deep technical background in applied AI/ML: 5–10+ years in research, engineering, solutions engineering, or technical product roles working on LLMs or multimodal systems, ideally in high-stakes, customer-facing environments
Hands-on experience with model improvement workflows: demonstrated experience with post-training techniques, evaluation design, benchmarking, and model quality iteration
Ability to work on hard, ambiguous technical problems: proven track record of partnering directly with advanced customers or research teams to scope, reason through, and execute on deep technical challenges involving frontier models
Strong technical fluency: you can read papers, interrogate metrics, write or review complex Python/SQL for analysis, and reason about model-data trade-offs
Executive presence with world-class researchers and enterprise leaders; excellent writing and storytelling
Bias to action: you ship, learn, and iterate.

Job Responsibility

Translate research → product: work with client-side researchers on post-training, evals, and safety/alignment, and build the primitives, data, and tooling they need
Partner deeply with core customers and frontier labs: work hands-on with leading AI teams and frontier research labs to tackle hard, open-ended technical problems related to frontier model improvement, performance, and deployment
Shape and propose model improvement work: translate customer and research objectives into clear, technically rigorous proposals—scoping post-training, evaluation, and safety work into well-defined statements of work and execution plans
Translate research into production impact: collaborate with customer-side researchers on post-training, evaluations, and alignment, and help design the data, primitives, and tooling required to improve frontier models in practice
Own the end-to-end lifecycle: lead discovery, write crisp PRDs and technical specs, prioritize trade-offs, run experiments, ship initial solutions, and scale successful pilots into durable, repeatable offerings
Lead complex, high-stakes engagements: independently run technical working sessions with senior customer stakeholders, define success metrics, surface risks early, and drive programs to measurable outcomes
Partner across Scale: collaborate closely with research (agents, browser/SWE agents), platform, operations, security, and finance to deliver reliable, production-grade results for demanding customers

What we offer

Comprehensive health, dental, and vision coverage
Retirement benefits
Learning and development stipend
Generous PTO
Commuter stipend
Equity-based compensation

Full-time

Researcher, Preparedness

The Preparedness team helps us prepare for the development of increasingly capab...

Location

United States, San Francisco

Salary:

295000.00 - 445000.00 USD / Year

OpenAI

Expiration Date

Until further notice

Requirements

Passionate and knowledgeable about short-term and long-term AI safety risks
Ability to think outside the box and have a robust 'red-teaming mindset'
Experience in ML research engineering, ML observability and monitoring, creating large language model-enabled applications, and/or another technical domain applicable to AI risk
Able to operate effectively in a dynamic and extremely fast-paced research environment as well as scope and deliver projects end-to-end

Job Responsibility

Own the scientific validity of frontier preparedness capability evaluations: design new evals grounded in real threat models (including high-consequence domains like CBRN as well as cyber and other frontier-risk areas), and maintain existing evals so they don't go stale or silently regress
Define datasets, graders, rubrics, and threshold guidance, and produce auditable artifacts (evaluation cards, capability reports, system-card inputs) that leadership can trust during high-stakes launches
Work on identifying emerging AI safety risks and new methodologies for exploring the impact of these risks
Build (and then continuously refine) evaluations of frontier AI models that assess the extent of identified risks
Design and build scalable systems and processes that can support these kinds of evaluations
Contribute to the refinement of risk management and the overall development of 'best practice' guidelines for AI safety evaluations

What we offer

Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
401(k) retirement plan with employer match
Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
Mental health and wellness support
Employer-paid basic life and disability coverage
Annual learning and development stipend to fuel your professional growth
Daily meals in our offices, and meal delivery credits as eligible

Full-time

Research Engineering Manager - Model Training

Perplexity is seeking a Research Engineering Manager to lead the team of all-sta...

Location

United States, San Francisco

Salary:

300000.00 - 470000.00 USD / Year

Perplexity

Expiration Date

Until further notice

Requirements

Proven experience with large-scale LLMs and Deep Learning systems
Strong Python and PyTorch skills
Experience leading or managing research or engineering teams working on large-scale AI model development, including driving complex projects from idea to production
Self‑starter with a willingness to take ownership of tasks and navigate ambiguity in a fast‑moving environment
Passion for tackling challenging problems in AI model quality, speed, safety, and reliability
10+ years of technical experience, with at least 2 of those years as a manager and at least 4 of those years working on large-scale AI model development

Job Responsibility

Lead a team of researchers and engineers focused on training SotA models for Perplexity-relevant use cases, leveraging the latest supervised and reinforcement learning techniques
Drive research and engineering efforts to develop production models through advanced model training and alignment techniques, including RL, SFT, and other approaches
Become deeply familiar with the team’s technical stack, leading from the front through hands-on technical contributions
Own the data, training, and eval pipelines required to train and continuously improve LLMs
Design and iterate on model training and finetuning algorithms (e.g., preference‑based methods, reinforcement learning from human or AI feedback) through an approach that balances scientific rigor and iteration velocity
Design evaluations and improve the production model training pipeline to reliably deliver models that lie on the Pareto frontier of speed and quality
Work closely with engineering teams to integrate in-house models into our product and rapidly iterate based on real‑world usage
Manage day‑to‑day execution, project planning, and prioritization for the model training team to hit ambitious quality and performance goals

What we offer

Equity
Health
Dental
Vision
Retirement
Fitness
Commuter and dependent care accounts

Full-time

Store Associate

Retail Store Associates play a meaningful role within the CVS Health family. At ...

Location

United States, Bradenton

Salary:

15.00 - 19.00 USD / Hour

CVS Health

Expiration Date

April 12, 2026

Requirements

At least 16 years of age
Physical Requirements: Remaining upright on the feet, particularly for sustained periods of time
Lifting and exerting up to 35 lbs of force occasionally, up to 10 lbs of force frequently, and a negligible amount of force regularly to move objects to and from, including overhead lifting
Visual Acuity - Having close visual acuity to perform activities such as: viewing a computer terminal, reading, visual inspection involving small parts/details

Job Responsibility

Providing differentiated customer service by anticipating customer needs, demonstrating compassion and care in all interactions, and actively identifying and resolving potential service issues
Focusing on the customer by giving a warm and friendly greeting, maintaining eye contact and offering help locating additional items, when needed
Accurately performing cashier duties: handling cash, checks and credit card transactions with precision while following company policies and procedures
Maintaining the sales floor by restocking shelves, checking in vendors, updating pricing information and completing inventory management tasks as directed by store manager
Supporting opening and closing store activities, when needed
Providing customer support to all departments, including photo and beauty, ensuring departments are fully stocked and operational while remaining current with all updated services and tools
Assisting pharmacy personnel when needed, including working regular shifts in the pharmacy as part of opportunities for growth and career development
Embracing and advocating for new CVS services and loyalty programs that support our purpose of helping people on their path to better health

What we offer

Affordable medical plan options
401(k) plan (including matching company contributions)
Employee stock purchase plan
No-cost programs for all colleagues including wellness screenings, tobacco cessation and weight management programs, confidential counseling and financial coaching
Paid time off
Flexible work schedules
Family leave
Dependent care resources
Colleague assistance programs
Tuition assistance

Part-time

Grad Intern - Marketing Cardio Renal Metabolic

Do more with the knowledge you’re working hard to acquire and the passion you al...

Location

Italy, Milan

Salary:

Not provided

Amgen

Expiration Date

Until further notice

Requirements

Master's degree in scientific or economic disciplines, including project management
Excellent knowledge of the Microsoft Office suite
Strong written and verbal communication skills
Good project management, analytical abilities, strategic thinking, creativity, and a collaborative team-oriented attitude
Fluent English, both oral and written

Job Responsibility

Be part of the Cardio-Renal-Metabolic Marketing team of Amgen Italy
Support the Marketing team by actively contributing to all department activities
Take part in marketing projects and campaigns under the supervision of a Brand Lead
Help define the communication plan and produce related promotional materials, ensuring quality and consistency of content
Design and implement multi-channel marketing campaigns
Develop analyses to assess the effectiveness of communication strategies, highlighting successes and optimization opportunities
Collaborate with cross-functional teams in organizing scientific events, internal meetings, and training for the field force

What we offer

Full support and career-development resources to expand your skills, enhance your expertise, and maximize your potential along your career journey
Diverse and inclusive community of belonging, where teammates are empowered to bring ideas to the table and act

Safety engineer

Are you passionate about safety, environmental protection, and functional safety...

Location

United Kingdom, Bristol

Salary:

46400.00 GBP / Year

Defence Equipment & Support

Expiration Date

Until further notice

Requirements

Must have either demonstrable experience of audit and assurance activities to an applicable standard (e.g. ISO 9001, 45001 or 14001) or an understanding of Systems Safety in the defence environment (e.g. DEF STAN 00-056)
Hold a Science, Technology, Engineering and/or Mathematics based qualification at Regulated Qualifications Framework (RQF) Level 4 or demonstrable relevant experience
Be professionally registered or intend to be professionally registered with a relevant Professional body/institution related to your discipline, as either: Registered Scientist (RSci) or Incorporated Engineer (IEng)
Must have lived in the UK for the last 5 years
Must obtain Security Check (SC) security clearance without any caveats

Job Responsibility

Lead technical safety assurance activities and collaborate with stakeholders to assure the systems and equipment we deliver are safe to operate
Working in a highly regulated area, educate others on relevant legislation, regulation, policy, processes, and standards to provide assurance against technical documentation for a sub-system
Identify and analyse safety acquisition risk reduction measures, assuring that these are adequately documented and managed
Develop and maintain ISEA safety assurance plans, monitor compliance, and ensure that safety assurance is sufficient to demonstrate that systems are safe to operate
Produce, review, endorse, and recommend acceptance of safety-related artefacts, including when equipment or systems are operational or being modified

What we offer

Ministry of Defence contributes £13,442 towards you being a member of the Civil Service Defined Benefit Pension scheme
25 days’ annual leave, rising by 1 day a year up to 30 days, plus 8 bank holidays and a day off for the King’s birthday
Market-leading average employer pension contribution of 28.97%
Annual performance-based bonus and recognition awards
Access to specialist training and funded qualifications
Support for progression
Huge range of discounts
Volunteering days
Enhanced parental leave schemes

Receptionist/Administrator

Help us to deliver great primary care by improving access, outcomes and patient ...

Location

United Kingdom, Great Missenden

Salary:

24960.00 GBP / Year

Operose Health

Expiration Date

February 25, 2026

Requirements

Reception or customer care experience
Excellent communicator, both spoken and written
Basic PC skills such as Word, Excel and email
Able to work within processes, procedures and maintain confidentiality and data security
Previous experience of working in the NHS is welcome but not essential

Job Responsibility

Responding to patient queries and liaising with the wider primary care team
Managing appointment requests
Signposting patients to our range of services
Maintaining patient records and confidentiality
Emailing, scanning and coding clinical correspondence
Processing prescription requests
Utilising other information systems to support efficient workflow processes

What we offer

27 days annual leave plus bank holidays pro rata
Access to bespoke learning management system and annual formative clinical assessments
Opportunities to specialise and develop
Car benefit scheme – specialising in electric vehicles
Cycle to work scheme
Travel season ticket loans
Discount cards
Employee wellbeing services including free yoga videos and employee wellbeing app

Part-time

Research Engineer, Frontier Evals & Environments

OpenAI

Location:
United States, San Francisco

Category:
IT - Software Development

Contract Type:
Not provided

Job Posted:
February 21, 2026
