Model Behavior Engineer Job at Notion (New York)

Product Manager, API Model Behavior

As a Model Behavior Product Manager for the API team, you'll be at the forefront...

Location

United States, San Francisco

Salary:

293000.00 - 385000.00 USD / Year

OpenAI

Expiration Date

Until further notice

Requirements

5+ years of product management or related industry experience
Experience collaborating directly and deeply with high-growth startups
Proven track record of building for developers, with strong intuition for designing clear, flexible APIs and primitives that scale from early experimentation to production use
Hands-on experience driving consensus and action in ambiguous spaces
Excel at collaborating across diverse teams and communicating complex ideas clearly

Job Responsibility

Define strategic priorities and roadmap for improving model behavior for API users, focusing on user outcomes, safety, reliability, and emerging capabilities
Partner with research and engineering teams at a technical level to translate those goals into model capability improvements
Partner with cross-functional teams to launch OpenAI’s frontier models in the API, and expose their full capabilities to users via flexible and powerful API primitives
Develop scalable methodologies, tools, and processes for evaluating, tuning, and iterating on model behavior
Synthesize user research, community feedback, and quantitative insights into targeted improvements in our AI models
Establish and iterate on clear, actionable metrics that accurately reflect model quality and user experience at scale

What we offer

Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
401(k) retirement plan with employer match
Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
Mental health and wellness support
Employer-paid basic life and disability coverage
Annual learning and development stipend to fuel your professional growth
Daily meals in our offices, and meal delivery credits as eligible

Fulltime

Product Manager, Model Behavior

As a Product Manager for the Model Behavior team, you'll be at the forefront of ...

Location

United States, San Francisco

Salary:

230000.00 - 325000.00 USD / Year

OpenAI

Expiration Date

Until further notice

Requirements

6+ years of product management or related industry experience
Interest in fields such as human-computer interaction, psychology, philosophy, or other relevant fields
Excitement about building not just a product, but a new form of intelligence, with the aim to benefit humanity
Hands-on experience driving consensus and action in ambiguous spaces
Know how to ask questions that uncover underlying constraints and assumptions
Excel at collaborating across diverse teams and communicating complex ideas clearly

Job Responsibility

Define strategic priorities and roadmap for improving model behavior, focusing on user outcomes, safety, reliability, and emerging capabilities
Partner closely with research, engineering, product design, and policy teams to translate strategic goals into actionable product initiatives
Develop scalable methodologies, tools, and processes for evaluating, tuning, and iterating on model behavior
Synthesize user research, community feedback, and quantitative insights into targeted improvements in our AI models
Establish and iterate on clear, actionable metrics that accurately reflect model quality and user experience at scale

What we offer

Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
401(k) retirement plan with employer match
Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
Mental health and wellness support
Employer-paid basic life and disability coverage
Annual learning and development stipend to fuel your professional growth
Daily meals in our offices, and meal delivery credits as eligible

Fulltime

Model Behavior Architect

We're looking for a Model Behavior Architect to help build Perplexity's AI produ...

Location

United States, San Francisco

Salary:

180000.00 - 260000.00 USD / Year

Perplexity

Expiration Date

Until further notice

Requirements

Experience designing evaluations, benchmarks, or metrics for AI systems
Strong written and verbal communication skills, particularly in explaining complex concepts to diverse stakeholders
Ability to manage multiple concurrent projects in a fast-moving environment
Strong experience with Perplexity or other frontier AI models in production settings
Demonstrated experience with Python — you'll prototype, debug, automate, and build systems at scale
3+ years of experience working with LLMs in a product or research setting

Job Responsibility

Context Engineering: Design, test, and optimize context strategies and system prompts that shape answer engine behavior across products, features, and use cases
Evaluation Systems: Build automated and semi-automated evaluation pipelines that measure model quality, catch regressions, and scale across product surfaces
Model Launch Support: Partner with research and engineering to validate model behavior before and during rollouts, ensuring smooth transitions with no degradation
Research & Analysis: Identify inconsistencies and failure modes in model outputs through well-designed research projects — for both internal and production-facing systems
Cross-functional Collaboration: Work closely with design, product, and research teams to translate product goals into concrete model behavior requirements
Knowledge Sharing: Help engineers across teams build intuition for prompt design, context engineering, and evaluation best practices
Staying Current: Track the latest alignment, evaluation, and prompting techniques from industry and academia, and bring the best ideas back to the team

What we offer

equity
health
dental
vision
retirement
fitness
commuter and dependent care accounts

Fulltime

Software Engineer II, Behavior Planning ML Platform

Aurora’s mission is to deliver the benefits of self-driving technology safely, q...

Location

United States, Pittsburgh

Salary:

126000.00 - 201000.00 USD / Year

Aurora Innovation

Expiration Date

Until further notice

Requirements

BS or higher degree in Computer Science/Engineering or related fields
More than 6 months of experience
Strong programming skills in C++ or Python, ideally both
Experience with machine learning frameworks (PyTorch or TensorFlow)
Solid foundation in computer science fundamentals - especially operating system concepts including concurrency, memory management and process scheduling.

Job Responsibility

Develop large scale pipelines for data extraction, model training and model evaluation
Build and optimize onboard ML infrastructure used to deploy models and run inference onboard the vehicle
Collaborate closely with motion planning, systems engineering, and other autonomy groups to define and develop critical ML workflow requirements.

Senior Low Observables Design & Integration Engineer - Pole Model

At Boeing, we innovate and collaborate to make the world a better place. We’re c...

Location

United States, Berkeley

Salary:

Not provided

Boeing

Expiration Date

Until further notice

Requirements

Bachelor of Science degree in Engineering (with a focus in Electrical, Mechanical or Aeronautical), Computer Science, Data Science, Mathematics, Physics, Chemistry or non-US equivalent qualifications directly related to the work statement
Professional experience working with low observable materials and technologies, including hands-on exposure to LO integration, design, manufacturing, and test
Demonstrated experience using computational electromagnetic solvers (e.g., FEKO, HFSS, CST, WIPL-D, CARLOS, SENTRI, XPATCH or equivalent) for design and optimization of LO systems and RCS predictions
Strong understanding of electromagnetic principles relevant to scattering phenomena and Radar Cross Section
Proven track record supporting fabrication and testing of LO components/assemblies and correlating measurements to models
Proficiency in data processing and analysis of RCS/EM test data, including calibration, clutter and background rejection, and data visualization techniques
Excellent technical writing skills with experience producing engineering reports, test plans, and test reports
Active U.S. Top Secret Security Clearance
Ability to obtain Special Program Access (U.S. Only Citizenship required)

Job Responsibility

Lead detailed LO design and integration for the RCS pole model, including material selection, incorporation of advanced technologies, and supplier hardware integration
Use computational electromagnetic solvers to model, analyze, and optimize radar cross section (RCS) and scattering behavior across required frequency bands and aspect angles in support of pre-test predictions and diagnostics
Work with Manufacturing to ensure proper alignment between design, analysis and fabrication of LO components
Define and support fabrication processes, QA checks, and build plans for LO skins, coatings, RAM treatments, and attachments; identify and mitigate manufacturability risks; and provide LO liaison support to the shop
Develop and execute test plans for RCS characterization (anechoic chamber and outdoor ranges), including instrumentation, calibration, and measurement repeatability considerations
Prepare and execute data processing workflows to reduce, calibrate, and analyze measured RCS and related EM test data; combine simulation and measurement data for validation and design iteration
Produce clear technical documentation: design descriptions, analysis reports, test plans, test reports, procedures, and presentation materials for program reviews
Mentor junior engineers and support continuous improvement of LO design, test, and data processing practices

What we offer

Best-in-class 401(k) plan: contributions matched dollar for dollar, up to 10% of eligible pay, with immediate 100% vesting
Student Loan Match
health insurance
flexible spending accounts
health savings accounts
retirement savings plans
life and disability insurance programs
paid and unpaid time away from work
Potential signing bonus for eligible/qualified external candidates
Relocation based on candidate eligibility

Fulltime

Software Engineer, Autonomy Behavior Validation

As a Software Engineer on the Software Validation team within the AV organizatio...

Location

United States, Sunnyvale

Salary:

123200.00 - 189100.00 USD / Year

General Motors

Expiration Date

Until further notice

Requirements

Master's degree in Computer Science, Software Engineering, Data Science, or related fields
1–3 years of professional software engineering experience (including internships, co-ops, or research engineering roles) building automation, internal tools, or data/analysis pipelines
Using large language models (LLMs) to summarize results, generate reports, or accelerate analysis
Building simple agents or scripts that chain tools together to complete tasks end-to-end
Strong programming skills in Python and experience with SQL
Experience writing clean, well-tested, and maintainable code for data processing, backend services, or scientific/analytical workflows
Experience working with large datasets to derive insights, build analyses, or drive decisions
Strong analytical thinking skills with the ability to interpret data and derive impactful conclusions
Ability to adapt and operate under ambiguity, going from quick code prototypes to longer-term, production-ready solutions on brief time horizons
Excellent communication skills, capable of switching between high-level and detailed technical discussions

Job Responsibility

Design and deploy metrics and test strategies at scale to evaluate the behavior of autonomous vehicles in simulation and on-road
Translate validation strategies into production-quality code and automation pipelines that execute high-quality AV behavior analysis for continuous and scaled software release cycles
Leverage AI-assisted and agentic workflows to build internal tools and frameworks that make it easier to author, configure, and deploy metrics, tests, and validation artifacts
Ensure the quality and reliability of behavior validation outputs through monitoring, alerting, automated checks, and continuous improvement of the underlying code and data pipelines
Collaborate across teams to establish coding and automation best practices for the Software Validation organization, and understand stakeholder needs and translate them into robust tools and workflows

What we offer

Bonus Potential: An incentive pay program offers payouts based on company performance, job level, and individual performance
medical
dental
vision
Health Savings Account
Flexible Spending Accounts
retirement savings plan
sickness and accident benefits
life insurance
paid vacation & holidays

Fulltime

AI Systems Engineer – AI Model (Training & Inference)

The AMD AI Group is looking for a Senior Software Development Engineer to own th...

Location

Canada, Markham

Salary:

106400.00 - 159600.00 CAD / Year

AMD

Expiration Date

Until further notice

Requirements

Industry experience shipping production AI/ML infrastructure, with hands-on work spanning both training and inference.
Bachelor's, Master's, or Ph.D. degree in Computer/Software Engineering, Computer Science, or a related technical discipline

Job Responsibility

Enable and optimize large-scale model training (LLMs, VLMs, MoE architectures) on AMD Instinct GPU clusters, ensuring correctness, reproducibility, and competitive throughput.
Build and maintain training infrastructure: job orchestration, distributed checkpointing, data loading pipelines, and storage optimization for multi-thousand GPU clusters on Kubernetes.
Debug and resolve training-specific issues including gradient norm explosions, non-deterministic behavior across GPU generations, and compute-communication overlap in distributed training (FSDP, DeepSpeed, Megatron-LM).
Optimize RCCL collective communication patterns for training workloads, including all-reduce, all-gather, and reduce-scatter across multi-node topologies.
Develop monitoring, alerting, and compliance infrastructure to ensure training cluster health, data security, and SLA adherence at scale.
Design and build end-to-end validation and testing infrastructure using proxy workloads, synthetic benchmarks, and configurable workload generators to systematically validate platform readiness across AMD Instinct GPU generations.
Write and optimize high-performance GPU kernels (GEMM, attention, quantized matmul, GPTQ/AWQ) in HIP, Triton, and MLIR targeting AMD Instinct architectures, with demonstrated ability to outperform open-source baselines.
Drive end-to-end inference enablement on new AMD GPU silicon - be among the first to get frontier models running on each new Instinct generation, creating reproducible guides and reference implementations.
Optimize inference serving frameworks (vLLM, SGLang, TorchServe) for AMD GPUs: batching strategies, KV-cache management, speculative decoding, and continuous batching for production throughput/latency targets.
Develop novel approaches to inference acceleration, including bio-inspired algorithms, SLM-assisted batching, and custom scheduling strategies that exploit AMD hardware characteristics.

Fulltime

Sr Staff AV Behavior Safety Engineer

The Safety Assurance for Effective Autonomous Driving Software (SAFE‑ADS) depart...

Location

United States

Salary:

185100.00 - 284100.00 USD / Year

General Motors

Expiration Date

Until further notice

Requirements

Bachelor's degree in Computer Science, Engineering, Mathematics, or a related field
6+ years of experience in machine learning, engineering, data science, or a related field
6+ years in autonomous vehicle or robotics development or related field
Demonstrated experience working on production‑intent AV programs
Track record providing technical safety leadership in AV development (e.g., defining safety strategies, risk assessments, validation methodologies, safety case contributions)
Deep understanding of AV behavior development: defining ODDs, behaviors, and evaluation criteria; analyzing simulation, closed‑course, and public‑road test data; and generating prioritized, actionable recommendations for developers
Experience applying AV safety standards and best practices, such as ISO 5083, ISO 21448 (SOTIF), and AVSC practices
Excellent communication and storytelling skills, including the ability to explain complex technical tradeoffs to executives and non‑technical stakeholders

Job Responsibility

Lead the strategy and support execution for how GM defines, measures, and validates the safety of SAE Level 3 – 4 Automated Driving Systems powered by machine‑learned models
Reference and interpret standards such as ISO 21448 (SOTIF), ISO 5083, and AVSC best practices to define GM’s strategy for safe autonomous system development, validation and deployment
Own the behavior‑focused portion of the ADS Safety Case, including key claims, sufficiency criteria, and recommended evidence for AV behavior safety performance
Collaborate with Software Validation, Embodied AI, Simulation, and Safety Metrics teams to define the end‑to‑end AV behavior validation methodology for AI‑driven systems
Set the strategy for how we systematically break down ODDs and how performance is validated per behavior and in aggregate
Collaborate on evaluation metrics, human benchmarks, and safety launch targets for AV behaviors and overall system performance, including supporting development of safety performance indicators (SPIs) for AV behaviors
Assess AV performance across safety and reliability dimensions using simulation, closed‑course, and public‑road data and provide clear, prioritized feedback to engineering teams
Define and run an assurance process to verify the sufficiency criteria and safety targets to support launch readiness

What we offer

Medical, dental, vision, Health Savings Account, Flexible Spending Accounts, retirement savings plan, sickness and accident benefits, life insurance, paid vacation & holidays, tuition assistance programs, employee assistance program, GM vehicle discounts
Company vehicle evaluation program
Bonus Potential: An incentive pay program offers payouts based on company performance, job level, and individual performance

Fulltime
