Research Engineer, Frontier Evals & Environments - Finance

OpenAI

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:
205000.00 - 380000.00 USD / Year

Job Description:

The Frontier Evals team builds north-star model evaluations to drive progress towards safe AGI/ASI. The team builds ambitious evaluations to measure and steer our models, and creates self-improvement loops to steer our training, safety, and launch decisions. Some of the team's open-sourced evaluations include SWE-bench Verified, MLE-bench, PaperBench, and SWE-Lancer, and the team built and ran frontier evaluations for GPT-4o, o1, o3, GPT-4.5, ChatGPT Agent, and GPT-5. If you want to experience the rapid progress of our models firsthand and steer them towards good, this is the team for you.

Job Responsibility:

  • Identify important model capabilities, skills, and behaviors that are crucial to financial workflows, and design methods to quantify performance in these areas
  • Own and pursue a research agenda to identify an important model capability (especially as it relates to financial reasoning) and build evals to measure it
  • Continuously refine evaluations of frontier AI models to assess the extent of frontier capabilities

Requirements:

  • Strong engineering and statistical analysis skills (with at least 2-3 years of full-time technical experience)
  • Passionate about evals for real world applications and knowledge work
  • Detail-oriented and thorough
  • Team player / willing to do a variety of tasks to move the team forward
  • Passionate and knowledgeable about AGI/ASI measurement
  • Able to operate effectively in a dynamic and extremely fast-paced research environment as well as scope and deliver projects end-to-end

Nice to have:

  • An ability to work cross-functionally
  • Excellent communication skills

What we offer:
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Relocation support for eligible employees
  • Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided
  • Equity

Additional Information:

Job Posted:
February 21, 2026

Employment Type:
Full-time
Work Type:
Hybrid work

Similar Jobs for Research Engineer, Frontier Evals & Environments - Finance

AI Architect

We’re hiring an AI Architect to sit at the intersection of frontier AI research,...
Location:
United States, San Francisco; New York
Salary:
201600.00 - 241920.00 USD / Year
Company:
Scale
Expiration Date:
Until further notice
Requirements:
  • Deep technical background in applied AI/ML: 5–10+ years in research, engineering, solutions engineering, or technical product roles working on LLMs or multimodal systems, ideally in high-stakes, customer-facing environments
  • Hands-on experience with model improvement workflows: demonstrated experience with post-training techniques, evaluation design, benchmarking, and model quality iteration
  • Ability to work on hard, ambiguous technical problems: proven track record of partnering directly with advanced customers or research teams to scope, reason through, and execute on deep technical challenges involving frontier models
  • Strong technical fluency: you can read papers, interrogate metrics, write or review complex Python/SQL for analysis, and reason about model-data trade-offs
  • Executive presence with world-class researchers and enterprise leaders; excellent writing and storytelling
  • Bias to action: you ship, learn, and iterate
Job Responsibility:
  • Translate research → product: work with client side researchers on post-training, evals, safety/alignment and build the primitives, data, and tooling they need
  • Partner deeply with core customers and frontier labs: work hands-on with leading AI teams and frontier research labs to tackle hard, open-ended technical problems related to frontier model improvement, performance, and deployment
  • Shape and propose model improvement work: translate customer and research objectives into clear, technically rigorous proposals—scoping post-training, evaluation, and safety work into well-defined statements of work and execution plans
  • Translate research into production impact: collaborate with customer-side researchers on post-training, evaluations, and alignment, and help design the data, primitives, and tooling required to improve frontier models in practice
  • Own the end-to-end lifecycle: lead discovery, write crisp PRDs and technical specs, prioritize trade-offs, run experiments, ship initial solutions, and scale successful pilots into durable, repeatable offerings
  • Lead complex, high-stakes engagements: independently run technical working sessions with senior customer stakeholders; define success metrics; surface risks early; and drive programs to measurable outcomes
  • Partner across Scale: collaborate closely with research (agents, browser/SWE agents), platform, operations, security, and finance to deliver reliable, production-grade results for demanding customers
What we offer:
  • Comprehensive health, dental and vision coverage
  • Retirement benefits
  • A learning and development stipend
  • Generous PTO
  • Commuter stipend
  • Equity-based compensation
Employment Type:
Full-time

Principal Architect - Microsoft Threat Protection

Israel is the biggest Microsoft center of excellence in the security domain and ...
Location:
Israel, Tel Aviv, Herzliya
Salary:
Not provided
Company:
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • 15+ years of professional experience as a software engineer building large-scale systems
  • Experience working closely with senior executives, providing strategic and technical guidance
  • Proven leadership and communication skills, with the ability to inspire and motivate those around you, and a proven track record as an architect or tech lead of a group of 100+ engineers
  • Deep expertise in software development, architecture, and design principles, including experience with distributed computing platforms for high-scale systems and massive amounts of data
  • Experience leveraging AI tools and automation to enhance engineering velocity, code quality, and operational efficiency, with a strong focus on responsible AI principles
Job Responsibility:
  • Drive and oversee technical initiatives across organizations that deliver substantial business value, utilizing the work of others and shaping the direction of the business
  • Coordinate initiatives, manage dependencies, and ensure timely delivery of high-quality solutions
  • Serve as a key technical advisor and confidant to leaders in the team, aligning closely with their strategies, values, and priorities and shaping their strategy
  • Attend staff meetings, providing technical insights and helping to shape strategic decisions
  • Represent the org in various forums, ensuring consistent communication and implementation of leadership directives
  • Provide clear and concise communication of technical issues and solutions to stakeholders at all levels
  • Provide technical direction and mentorship to engineering teams and leads, fostering a culture of innovation and engineering excellence
  • Promote a culture of continuous learning and improvement, encouraging professional development and knowledge sharing
  • Ensure technical decisions align with business goals and promote long-term sustainability and scalability
Employment Type:
Full-time

Senior Data Center Technician

As a Microsoft Senior Data Center Technician (DCT), you will demonstrate experti...
Location:
Italy, Milan
Salary:
Not provided
Company:
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Completed High/Secondary School, GED, an apprenticeship/vocational qualification, or equivalent experience and basic knowledge of computer hardware and components
  • Experienced in diagnostics, troubleshooting, and root cause analysis of hardware deployments
  • Experience decommissioning and upgrading hardware
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • This role requires you to be on-site, and the location has limited public transport. Your own form of transport is advised
Job Responsibility:
  • Performs diagnostics and troubleshooting following standard procedures, quickly identifies the cause(s) of issues, and replaces faulty components with minimal customer and business disruption
  • Performs post-execution quality checks and verifies that grounding, staging, labeling, and cabling are set up properly according to safety protocols, deployment standards, and planned Network Design Tasks (NDTs)
  • Decommissions hardware for simple changes and refreshes (e.g., memory upgrades, rebuilds) following standard procedures with minimal guidance
  • Follows procedures to communicate, report, and escalate incidents to appropriate Microsoft data center operations management units, Technician Leads, and engineering specialists
  • Assists and provides guidance to other technicians to complete challenging or complex tasks
  • Completes required training aligned to the role and workload; observes more experienced technicians to gain hands-on experience and relevant on-the-job training
  • Contributes to a positive and effective team environment by sharing information with others, contributing to regular team meetings, asking questions, and staying apprised of the status of others' work
  • Has pride and a sense of accountability for the service quality, completeness, and resulting user experience; displays accountability and ownership of the data center facilities
What we offer:
  • Training and growth opportunities including Career Rotation Programs, Diversity & Inclusion training and events, and professional certifications
Employment Type:
Full-time

Client Manager

A brilliant opportunity has arisen for an ambitious individual to be directly re...
Location:
United Kingdom, London
Salary:
Not provided
Company:
Kantar
Expiration Date:
Until further notice
Requirements:
  • Strong analytical skills and the ability to inspire clients through your insights
  • A track record of achieving high levels of client satisfaction
  • A growth mindset
  • A passion for consumer trends and the food industry
  • Be prepared to celebrate success!
Job Responsibility:
  • Work with your Consumer Insight Director & Strategic Insight Director to draw up and oversee a strategy to drive the long-term success of our relationships
  • Deliver valuable insights to the client base and look to grow their use of our different analytical capabilities, working closely with the experts around the business
  • Lead client contract renewal discussions with support from the Strategic Insight Director
  • Play a role in the coaching and development of others
  • Collaborate with the highly successful Dairy New Business team to win new clients and ensure the quality of delivery makes them want to invest in future projects
What we offer:
  • 25 days holidays plus your birthday off!
  • Option of flexible hybrid working
  • Employee Assistance Programme
  • Private Medical & Dental Insurance
  • Eyecare Vouchers
  • Life Assurance & Income Protection
  • Cycle to Work Scheme
Employment Type:
Full-time

Manager Measurement Lead

The AML Program Lead is responsible for servicing the Reddit account through the...
Location:
United Kingdom, London
Salary:
Not provided
Company:
Kantar
Expiration Date:
Until further notice
Requirements:
  • 2+ years advertising research / measurement experience with knowledge of experimentation and brand lift methodologies
  • 1+ years of people management or at least 6 months of experience in the AML program
  • 1+ years of client facing experience and a track record of success in client interactions, preferably in digital ad research, digital ad tech, at a research supplier, a media owner, brand, or agency partner
  • Exceptional written and oral communication skills: an articulate, engaging, data-driven storyteller who is passionate about measurement and uses logical reasoning to inspire and drive action
  • Strong attention to detail and strong process and time management skills; able to prioritize and deliver against multiple competing deadlines and collaborate cross-functionally; prior project management experience preferred
  • Natural curiosity, can-do attitude, and a track record of taking initiative to drive lasting change
  • Ability to handle tricky conversations with clients in a productive way that demonstrates your commitment to being a long-term partner and ability to navigate delicate situations, such as sharing constructive feedback
  • Undergraduate degree with coursework in marketing, business administration, economics, statistics, math, social sciences, or a related field
  • Proficiency in Microsoft Office and Google Suite tools
Job Responsibility:
  • Execute brand lift studies from start to finish on the Reddit platform by leading kick-off calls, setting up new studies, designing surveys, analyzing data, writing final reports, and presenting results to internal and external teams
  • Oversee up to 5 AMLs and ensure team members meet their AML Program goals by maintaining accurate data tracking, performing quality checks to ensure accuracy across project work, regularly sharing feedback from team members and stakeholders to identify areas for growth, and taking action to enhance program efficiency
  • Create onboarding plans and supporting materials to fully onboard new hires, ranging from trainings on study methodology to guidance on how to effectively present Brand Lift results to clients with actionable recommendations that are tied to research objectives
  • Develop expertise in key advertiser verticals and pilot new research solutions for our main client as needed
  • Lead weekly internal meetings to cascade account updates, share learnings, and highlight new product/process changes that impact day-to-day work
  • Exhibit critical thinking to drive process improvements and address issues proactively alongside Kantar/Reddit leadership teams to constantly improve our client relationship
  • Closely partner with the AML Program Manager and North America Program Leads to identify knowledge gaps across team members, align on program needs, help escalate cross-functional feedback, and support the growth of our program YOY
  • Foster a positive work environment by recognizing team achievements, encouraging open communication, and promoting engagement
Employment Type:
Part-time

Locum General Dentist

Aya Locums has an immediate opening for a 14 week locum General Dentist job in E...
Location:
United States, Everett
Salary:
100.00 - 125.00 USD / Hour
Company:
Aya Locums
Expiration Date:
Until further notice
Requirements:
  • Doctor of Dental Surgery (DDS) or Doctor of Dental Medicine (DMD) degree from an accredited dental school
  • Active and unrestricted dental license in Washington
  • Current BLS certification
  • At least one year of experience in a dental practice
Job Responsibility:
  • Conduct comprehensive oral health assessments and diagnose dental conditions
  • Develop and implement individualized treatment plans
  • Perform a variety of dental procedures, including fillings, extractions, root canals, and crown and bridge work
  • Provide preventive dental care, such as cleanings and oral hygiene education
  • Administer local anesthesia and nitrous oxide as needed
  • Collaborate with dental hygienists, dental assistants and other dental professionals
  • Maintain accurate and complete dental records
What we offer:
  • Access to top hospitals and healthcare systems in diverse care settings
  • Highly competitive, transparent locum tenens pay
  • Dedicated application and assignment support
  • In-house credentialing and licensing teams
  • Travel and lodging coverage
  • Easy timekeeping and streamlined management of documents
  • Malpractice coverage and risk management support
Employment Type:
Full-time

General Dentist

Aya Locums has an immediate opening for a 1 week locum General Dentist job in Oc...
Location:
United States, Ochopee
Salary:
100.00 - 125.00 USD / Hour
Company:
Aya Locums
Expiration Date:
Until further notice
Requirements:
  • Doctor of Dental Surgery (DDS) or Doctor of Dental Medicine (DMD) degree from an accredited dental school
  • Active and unrestricted dental license in Florida
  • Current BLS certification
  • At least one year of experience in a dental practice
Job Responsibility:
  • Conduct comprehensive oral health assessments and diagnose dental conditions
  • Develop and implement individualized treatment plans
  • Perform a variety of dental procedures, including fillings, extractions, root canals, and crown and bridge work
  • Provide preventive dental care, such as cleanings and oral hygiene education
  • Administer local anesthesia and nitrous oxide as needed
  • Collaborate with dental hygienists, dental assistants and other dental professionals
  • Maintain accurate and complete dental records
What we offer:
  • Access to top hospitals and healthcare systems in diverse care settings
  • Highly competitive, transparent locum tenens pay
  • Dedicated application and assignment support
  • In-house credentialing and licensing teams
  • Travel and lodging coverage
  • Easy timekeeping and streamlined management of documents
  • Malpractice coverage and risk management support
Employment Type:
Full-time

Locum General Dentist

Aya Locums has an immediate opening for a 3 week locum General Dentist job in Na...
Location:
United States, Nashville
Salary:
100.00 - 125.00 USD / Hour
Company:
Aya Locums
Expiration Date:
Until further notice
Requirements:
  • Doctor of Dental Surgery (DDS) or Doctor of Dental Medicine (DMD) degree from an accredited dental school
  • Active and unrestricted dental license in Tennessee
  • Current BLS certification
  • At least one year of experience in a dental practice
  • Strong clinical knowledge and dental assessment skills
  • Excellent manual dexterity and hand-eye coordination
  • Effective communication and interpersonal skills
  • Proficiency in using dental equipment and technology
  • Ability to manage dental emergencies effectively
  • Time management skills
Job Responsibility:
  • Conduct comprehensive oral health assessments and diagnose dental conditions
  • Develop and implement individualized treatment plans
  • Perform a variety of dental procedures, including fillings, extractions, root canals, and crown and bridge work
  • Provide preventive dental care, such as cleanings and oral hygiene education
  • Administer local anesthesia and nitrous oxide as needed
  • Collaborate with dental hygienists, dental assistants and other dental professionals
  • Maintain accurate and complete dental records
What we offer:
  • Access to top hospitals and healthcare systems in diverse care settings
  • Highly competitive, transparent locum tenens pay
  • Dedicated application and assignment support
  • In-house credentialing and licensing teams
  • Travel and lodging coverage
  • Easy timekeeping and streamlined management of documents
  • Malpractice coverage and risk management support
Employment Type:
Full-time