Senior Research Engineer, Model Evaluation Job at Cohere (Toronto)

Senior Research Engineer, LLM Evaluation and Behavioral Analysis

Together AI is building the fastest, most capable open-source-aligned LLMs and i...

Location

United States, San Francisco

Salary:

220000.00 - 270000.00 USD / Year

Together AI

Expiration Date

Until further notice

Requirements

Strong engineering skills with Python, evaluation tooling, and distributed workflows
Experience working with LLMs or transformer-based models, particularly in model evaluation, testing, or red-teaming
Ability to reason clearly about qualitative behavior, edge cases, and model failure patterns
Experience designing experiments, building datasets, and interpreting noisy behavioral signals
Understanding of function calling and structured output formats
Familiarity with GPU or distributed compute environments
Hands-on experience evaluating function-calling models, agentic systems, or tool-augmented LLM pipelines
Experience with multi-turn or multi-step reasoning tasks
Familiarity with inference systems, distributed infrastructure, or post-training workflows
Passion for discovering subtle behaviors, surprising model gaps, or edge-case failures

Job Responsibility

Build and iterate on evaluation frameworks that measure model performance across instruction following, function calling, long-context reasoning, multi-turn dialog, safety, and agentic behaviors
Develop specialized evaluation suites for: function calling (argument correctness, schema adherence, tool selection, multi-function planning, and error recovery); agentic workflows (task decomposition, multi-step planning, self-correction, and autonomous tool-use sequences); and tool-augmented interactions (search, retrieval, code execution, API-driven actions)
Create automated CI/CD pipelines for A/B comparisons, regression detection, behavioral drift monitoring, and adversarial probing
Design and curate high-quality evaluation datasets, especially nuanced or challenging cases across domains
Collaborate with researchers and engineers to diagnose failures, triage regressions, and guide data selection, shaping strategies, objective design, and system improvements
Work with engineering teams to build dashboards, reports, and internal tools that help visualize behavior changes across releases
Operate in a fast-paced, high-impact environment with deep technical ownership and close partnership with world-class model researchers and infra engineers

What we offer

competitive compensation
startup equity
health insurance
other benefits

Fulltime

Senior Research Scientist, Model Evaluation

Evaluation is critical to making progress in scaling intelligence. As models con...

Location

United States; Canada; United Kingdom, Toronto; New York; Seattle; San Francisco; London; Paris

Salary:

Not provided

Cohere

Expiration Date

Until further notice

Requirements

Enjoy rapidly building prototypes that demonstrate the boundaries of what LLMs are capable of
Have developed resources to measure LLM capabilities
Have spent dozens of hours reviewing complex data and LLM outputs to ensure high data quality
Are obsessive about rigorously measuring AI capabilities and ensuring measurements align with the capabilities you care about
Have strong software engineering skills

Job Responsibility

Create ambitious new evaluation benchmarks that push the limits of what our models can accomplish
Work on highly cross-functional teams to translate model feedback into trustworthy, repeatable evaluations
Conduct research to advance the state of the art in LLM evaluation methods, including training LLM judges, refining LLM-based data synthesis pipelines, and improving evaluation efficiency
Build scalable and reusable tools for digging into model performance

What we offer

An open and inclusive culture and work environment
Work closely with a team on the cutting edge of AI research
Weekly lunch stipend, in-office lunches & snacks
Full health and dental benefits, including a separate budget to take care of your mental health
100% Parental Leave top-up for up to 6 months
Personal enrichment benefits towards arts and culture, fitness and well-being, quality time, and workspace improvement
Remote-flexible, offices in Toronto, New York, San Francisco, London and Paris, as well as a co-working stipend
6 weeks of vacation (30 working days!)

Fulltime

Senior Research Engineer

As a Senior Research Engineer at Microsoft, you will advance Microsoft’s mission...

Location

United States, Redmond

Salary:

119800.00 - 234700.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Proficiency in Python and at least one deep learning framework such as PyTorch, JAX, or TensorFlow
Experience deploying fine-tuned LLMs or multimodal models in live production environments
Experience shipping and maintaining production AI systems
Ability to meet Microsoft, customer and/or government security screening requirements
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter

Job Responsibility

Design and implement AI systems using foundation models, prompt engineering, retrieval-augmented generation, multi-agent architectures, and classic ML
Fine-tune large language models on domain-specific data and evaluate via offline and online methods such as A/B testing, telemetry, and shadow deployments
Build and harden prototypes into production-ready services using robust software engineering and MLOps practices
Drive original research and thought leadership (whitepapers, internal notes, patents); convert insights into shipped capabilities
Research Translation: Continuously review emerging work; identify high-potential methods and adapt them to Microsoft problem spaces
ML Design & Architecture: Own the end-to-end pipeline: data prep, training, evaluation, deployment, and feedback loops
Identify and resolve model quality gaps, latency issues, and scale bottlenecks using PyTorch or TensorFlow
Operate CI/CD and MLOps workflows including model versioning, retraining, evaluation, and monitoring

Fulltime

Senior Research Engineer

As a Senior Research Engineer at Microsoft, you will help advance Microsoft’s mi...

Location

United States, Redmond

Salary:

119800.00 - 234700.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
OR equivalent experience
Proficiency in Python and at least one deep learning framework such as PyTorch, JAX, or TensorFlow
Experience deploying fine-tuned LLMs or multimodal models in live production environments
Experience shipping and maintaining production AI systems
Ability to meet Microsoft, customer and/or government security screening requirements
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter

Job Responsibility

Build AI-First Contact Center Experiences
Bringing State-of-the-Art Research to Products
Design and implement AI systems using foundation models, prompt engineering, retrieval-augmented generation, multi-agent architectures, and classic ML
Fine-tune large language models on domain-specific data and evaluate via offline and online methods such as A/B testing, telemetry, and shadow deployments
Build and harden prototypes into production-ready services using robust software engineering and MLOps practices
Drive original research and thought leadership (whitepapers, internal notes, patents); convert insights into shipped capabilities
Research Translation: Continuously review emerging work; identify high-potential methods and adapt them to Microsoft problem spaces
Partner with product teams to improve customer and agent outcomes

Fulltime

Senior Research Engineer - Dynamics 365 Contact Center

We are looking for a Senior Research Engineer to join our team. As a Senior Rese...

Location

Czech Republic, Prague

Salary:

Not provided

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND related experience (e.g., statistics, predictive analytics, research)
OR Master's Degree in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND related experience (e.g., statistics, predictive analytics, research)
OR Doctorate in Statistics, Econometrics, Computer Science, Electrical or Computer Engineering, or related field AND related experience (e.g., statistics, predictive analytics, research)
OR equivalent experience
Experience developing and deploying live production systems, as part of a product team
Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
These requirements include but are not limited to the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter

Job Responsibility

Build collaborative relationships with product and business groups to deliver AI-driven impact
Research and implement state-of-the-art methods using foundation models, prompt engineering, RAG, graphs, multi-agent architectures, as well as classical machine learning techniques
Fine-tune foundation models using domain-specific datasets
Evaluate model behavior on relevance, bias, hallucination, and response quality via offline evaluations, shadow experiments, online experiments, and ROI analysis
Build rapid AI solution prototypes, contribute to production deployment of these solutions, debug production code, support MLOps/AIOps
Contribute to papers, patents, and conference presentations
Translate research into production-ready solutions that address customer needs, and measure their impact through A/B testing and telemetry
Use data to identify gaps in AI quality, uncover insights, and implement proofs of concept (PoCs)
Demonstrate deep expertise in AI subfields (e.g., deep learning, Generative AI, NLP, multimodal models) to translate cutting-edge research into practical, real-world solutions that drive product innovation and business impact
Share insights on industry trends and applied technologies with engineering and product teams

Fulltime

Senior Research Engineer

As a Senior Research Engineer at Microsoft, you will advance Microsoft’s mission...

Location

United States, Redmond

Salary:

119800.00 - 234700.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor’s degree in Computer Science, Engineering, Mathematics, Statistics, Physics, or a related field and 4 or more years in applied ML or AI research and product engineering
OR Master’s degree and 3 or more years in applied ML or AI research and product engineering
OR PhD in a relevant field and 2 or more years with generative AI, LLMs, or related ML algorithms
Ability to meet Microsoft, customer and/or government security screening requirements
Microsoft Cloud Background Check upon hire/transfer and every two years thereafter

Job Responsibility

Bringing State-of-the-Art Research to Products
Design and implement AI systems using foundation models, prompt engineering, retrieval-augmented generation, multi-agent architectures, and classic ML
Fine-tune large language models on domain-specific data and evaluate via offline and online methods such as A/B testing, telemetry, and shadow deployments
Build and harden prototypes into production-ready services using robust software engineering and MLOps practices
Drive original research and thought leadership (whitepapers, internal notes, patents); convert insights into shipped capabilities
Research Translation: Continuously review emerging work; identify high-potential methods and adapt them to Microsoft problem spaces
End-to-End System Development
ML Design & Architecture: Own the end-to-end pipeline: data prep, training, evaluation, deployment, and feedback loops

Fulltime

Senior Research Engineer

As a Senior Research Engineer at Microsoft, you will advance Microsoft’s mission...

Location

United States, Redmond

Salary:

119800.00 - 234700.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
3+ years of experience using Python and at least one deep learning framework such as PyTorch, JAX, or TensorFlow
Experience deploying fine-tuned LLMs or multimodal models in live production environments
Experience shipping and maintaining production AI systems
Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter

Job Responsibility

Bringing State-of-the-Art Research to Products: Design and implement AI systems using foundation models, prompt engineering, retrieval-augmented generation, multi-agent architectures, and classic ML
Fine-tune large language models on domain-specific data and evaluate via offline and online methods such as A/B testing, telemetry, and shadow deployments
Build and harden prototypes into production-ready services using robust software engineering and MLOps practices
Drive original research and thought leadership (whitepapers, internal notes, patents); convert insights into shipped capabilities
Research Translation: Continuously review emerging work; identify high-potential methods and adapt them to Microsoft problem spaces
End-to-End System Development: ML Design & Architecture: Own the end-to-end pipeline: data prep, training, evaluation, deployment, and feedback loops
Identify and resolve model quality gaps, latency issues, and scale bottlenecks using PyTorch or TensorFlow
Operate CI/CD and MLOps workflows including model versioning, retraining, evaluation, and monitoring

Fulltime

Senior Research Engineer

As a Research Engineer at Microsoft, you will set the technical vision and lead ...

Location

United States, Redmond

Salary:

119800.00 - 234700.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
Proven track record leading large-scale AI systems and cross-org initiatives that shipped
Solid software engineering foundations and hands-on depth in Python plus deep-learning frameworks (PyTorch/TensorFlow) and modern MLOps/tooling
Experience shipping and maintaining production AI systems
Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter

Job Responsibility

Architect and deliver complex AI systems across model development, data, infra, evaluation, and deployment spanning multiple product lines
Set technical direction for large programs; drive alignment across Research, Engineering, and Product
Build and harden prototypes into production-ready services using robust software engineering and MLOps practices
Integrate LLMs, multimodal models, multi-agent architectures, and RAG into Microsoft’s ecosystem
Establish best practices for MLOps, governance, and Responsible AI, compliant with Microsoft principles and industry standards
Drive original research and thought leadership (whitepapers, internal notes, patents); convert insights into shipped capabilities
Research Translation: Continuously review emerging work; identify high-potential methods and adapt them to Microsoft problem spaces

Fulltime
