Model Evaluation QA Lead

Deepgram

Location:
United States

Contract Type:
Not provided

Salary:

180000.00 - 230000.00 USD / Year

Job Description:

As Model Evaluation QA Lead, you’ll be the technical owner of model quality assurance across Deepgram’s AI pipeline—from pre-training data validation and provenance through post-deployment monitoring. Reporting to the QA Engineering Manager, you will partner directly with our Active Learning and Data Ops teams to build and operate the evaluation infrastructure that ensures every model Deepgram ships meets objective quality bars across languages, domains, and deployment contexts. This is a hands-on, high-impact role at the intersection of QA engineering and ML operations. You will design automated evaluation frameworks, integrate model quality gates into release pipelines, and drive industry-standard benchmarking—ensuring Deepgram maintains its position as the accuracy and latency leader in voice AI.

Job Responsibility:

  • Model Evaluation Automation: Design, build, and maintain automated model evaluation pipelines that run against every candidate model before release
  • Release Gate Integration: Embed model quality checkpoints into CI/CD and release pipelines
  • Agent & Model Evaluation Frameworks: Stand up and operate evaluation tooling for end-to-end voice agent testing
  • Active Learning & Data Ingestion Testing: Partner with the Active Learning team to validate data ingestion infrastructure, annotation pipelines, and retraining automation
  • Industry Benchmark Automation: Automate execution and reporting of industry-standard benchmarks
  • Language & Domain Validation: Build and maintain test suites for multi-language and domain-specific model validation
  • Retraining Automation Support: Validate the end-to-end retraining pipeline across all data sources
  • Manual Test Feedback Loop: Design and operate human-in-the-loop evaluation workflows for subjective quality assessment

Requirements:

  • 4–7 years of experience in QA engineering, ML evaluation, or a related technical role, with a focus on the quality of predictive and generative models and their data
  • Hands-on experience building automated test/evaluation pipelines for ML models and the software features connected to them
  • Strong programming skills in Python, including experience with ML evaluation libraries, data processing frameworks (Pandas, NumPy), and scripting for pipeline automation
  • Familiarity with speech/audio ML concepts such as acoustic models and language models, and with evaluation metrics such as WER, SER, MOS, or similar
  • Experience with CI/CD integration for ML workflows (e.g., GitHub Actions, Jenkins, Argo, MLflow, or equivalent)
  • Ability to design and maintain reproducible benchmark environments across multiple model versions and configurations
  • Strong communication skills—you can translate model quality metrics into actionable insights for engineering, research, and product stakeholders
  • Detail-oriented and systematic, with a bias toward automation over manual process

Nice to have:

  • Experience with model evaluation platforms (Coval, Braintrust, Weights & Biases, or custom evaluation harnesses)
  • Background in speech recognition, NLP, or audio processing domains
  • Experience with distributed evaluation at scale—running evals across GPU clusters or large dataset partitions
  • Familiarity with human-in-the-loop evaluation design and annotation pipeline tooling
  • Experience with multi-language model evaluation and localization quality assurance
  • Prior work in a company where ML model quality directly impacted revenue or customer SLAs

What we offer:

  • Medical, dental, vision benefits
  • Annual wellness stipend
  • Mental health support
  • Life, STD, LTD Income Insurance Plans
  • Unlimited PTO
  • Generous paid parental leave
  • Flexible schedule
  • 12 Paid US company holidays
  • Quarterly personal productivity stipend
  • One-time stipend for home office upgrades
  • 401(k) plan with company match
  • Tax Savings Programs
  • Learning / Education stipend
  • Participation in talks and conferences
  • Employee Resource Groups
  • AI enablement workshops / sessions
  • Equity
  • 10% annual bonus

Additional Information:

Job Posted:
February 18, 2026

Employment Type:
Fulltime
Work Type:
Remote work

Similar Jobs for Model Evaluation QA Lead

Manager, Support Quality

As the Support Quality Manager at Replit, you’ll build and lead the program that...
Location:
United States, Foster City
Salary:
140000.00 - 175000.00 USD / Year
Replit
Expiration Date
Until further notice
Requirements
  • 5+ years in Support Quality, Support Operations, Technical Support, or similar roles in a technology company
  • 2+ years in a people management or team leadership capacity
  • Experience building or significantly evolving a QA program, framework, or evaluation system
  • Strong understanding of customer support workflows, ticket lifecycle, and escalation patterns
  • Experience working with support platforms (Zendesk or similar) and QA tooling or review workflows
  • Strong analytical mindset with experience using data to identify trends and drive performance improvements
  • Experience working cross-functionally with Support leadership, Operations, and Enablement or Training teams
  • Strong written and verbal communication skills, including delivering structured performance feedback and coaching guidance
  • Experience working in fast-moving product environments with frequent releases and evolving workflows
  • Hands-on experience using AI tools (e.g., Replit, Claude, ChatGPT, or similar) to improve workflows, knowledge creation, training, or support operations
Job Responsibility
  • Build and lead the Support QA program, including evaluation frameworks, scoring models, review workflows, and calibration processes
  • Hire, develop, and manage QA specialists or analysts as the program scales
  • Define quality standards across ticket support, technical troubleshooting, and customer communication
  • Establish QA coverage strategy across FTE and vendor support teams
  • Lead calibration programs to ensure consistent quality standards across reviewers, teams, and regions
  • Partner with Learning & Knowledge to turn QA insights into training, onboarding improvements, and coaching strategies
  • Partner with Support Operations to embed quality signals into dashboards, reporting, and performance frameworks
  • Define and evolve quality standards for AI-assisted support, including agent assist usage, automation handoffs, and AI-generated content quality
  • Utilize Replit to build internal tooling and recommend external tooling when necessary to improve QA workflows and program scalability
  • Define and track key quality metrics (QA trends, CSAT correlation, escalation rate, repeat contact rate, policy adherence) and report insights to Support leadership
What we offer
  • Competitive Salary & Equity
  • 401(k) Program with a 4% match
  • Health, Dental, Vision and Life Insurance
  • Short Term and Long Term Disability
  • Paid Parental, Medical, Caregiver Leave
  • Commuter Benefits
  • Monthly Wellness Stipend
  • Autonomous Work Environment
  • In Office Set-Up Reimbursement
  • Flexible Time Off (FTO) + Holidays
  • Fulltime

Manager, Content Engineering — AI Content Understanding

Product Content Engineering is a horizontal function supporting initiatives acro...
Location:
United States, Menlo Park
Salary:
162000.00 - 227000.00 USD / Year
Meta
Expiration Date
Until further notice
Requirements
  • 8+ years of experience in content strategy, content operations, AI evaluation, or a related field
  • 2+ years of people management experience, including hiring, developing, and performance-managing direct reports
  • Experience managing cross-functional programs with engineering, data science, and product partners in fast-paced environments
  • 1+ years working with generative AI products, AI evaluation, prompt engineering, annotation, and/or content labeling and analysis
  • Experience designing or operationalizing evaluation frameworks, annotation guidelines, or quality rubrics for AI/ML systems
  • Demonstrated ability to manage multiple concurrent workstreams with competing priorities and tight deadlines
  • Proven analytical skills with experience interpreting evaluation data and communicating findings to technical and non-technical audiences
  • Track record of building team operational processes and quality standards from the ground up or during periods of significant change
Job Responsibility
  • Manage and develop a team of Content Engineers and contingent workers, setting clear goals, providing regular feedback, and supporting career growth
  • Own the execution of continuous CU model evaluations — coordinating sprint planning, reviewer assignments, QA processes, and delivery timelines across multiple concurrent workstreams
  • Drive the creation and maintenance of golden datasets that serve as ground truth for model benchmarking and auto-eval calibration
  • Partner with engineering, data science, and product teams to translate evaluation insights into actionable recommendations for model improvement and prompt optimization
  • Lead the team's contribution to LLM-as-a-Judge (auto-eval) initiatives — ensuring human evaluation data is used to calibrate, validate, and improve automated evaluation systems
  • Define and maintain evaluation guidelines, rubrics, and quality standards in partnership with Lead Content Engineers, ensuring consistency across reviewers and use cases
  • Build repeatable operational processes for evaluation sprints, including reviewer training, calibration sessions, and escalation workflows
  • Manage CW workforce planning — hiring, onboarding, allocation across workstreams, and performance management
  • Synthesize evaluation results into structured reports and present findings to cross-functional leadership, including engineering leads and lead product stakeholders
  • Identify and mitigate operational risks — staffing gaps, timeline conflicts, quality regressions — before they impact delivery
What we offer
  • bonus
  • equity
  • benefits
  • Fulltime

Senior Manager - Quality Center of Excellence

We are seeking a seasoned Senior Manager – Quality Center of Excellence (CoE) to...
Location:
India, Hyderabad
Salary:
Not provided
Alter Domus
Expiration Date
Until further notice
Requirements
  • 12–18+ years of experience in Quality Assurance or Quality Engineering
  • Financial Services experience a plus
  • Proven experience building or leading a QA / Quality Engineering CoE
  • Strong expertise in automation strategies, CI/CD testing, and quality analytics
  • Strong leadership, communication, and influencing skills
  • Analytical, outcome-driven mindset
Job Responsibility
  • Establish and lead the Quality Center of Excellence (CoE) as the governing body for quality standards, frameworks, and metrics
  • Define a standardized QA operating model across manual, automation, performance, unit, and non-functional testing
  • Drive Shift-Left and Shift-Right testing strategies across delivery teams
  • Define engagement models between CoE and delivery teams
  • Ensure alignment with organizational goals for speed, scalability, cost efficiency, and risk reduction
  • Own the enterprise QA tooling strategy including test management, automation, performance, and defect tracking tools
  • Evaluate, standardize, and rationalize QA tools to improve ROI
  • Integrate QA tooling with CI/CD pipelines in partnership with DevOps teams
  • Promote automation-first and reusable framework approaches
  • Define enterprise-wide quality KPIs including defect leakage, coverage, automation ROI, cycle time, and escape rates
What we offer
  • Support for professional accreditations such as ACCA and study leave
  • Flexible arrangements, generous holidays, plus an additional day off for your birthday
  • Continuous mentoring along your career progression
  • Active sports, events and social committees across our offices
  • 24/7 support available from our Employee Assistance Program
  • The opportunity to invest in our growth and success through our Employee Share Plan
  • Plus additional local benefits depending on your location

Staff/Senior Test Consultant (SDET)

10Pearls is seeking a Senior QA Engineer (SDET) - Platform Quality & Automation ...
Location:
Pakistan, Islamabad
Salary:
Not provided
10Pearls
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in Computer Science, Engineering, or a related field (preferred)
  • 5–8 years of experience in SDET or test automation roles, with ownership of test strategy
  • Strong proficiency in Python (pytest, fixtures, parametrization, plugins) or equivalent automation stack
  • Proven experience designing automation frameworks used across multiple teams
  • Hands-on experience with contract testing (Pact, Spring Cloud Contract) and API testing (REST, GraphQL, gRPC)
  • Experience with performance testing tools such as k6, Locust, JMeter, or Gatling
  • Strong understanding of Kubernetes-based testing environments (ephemeral environments, Helm deployments)
  • Experience integrating testing into CI/CD pipelines (Azure DevOps – ADO), including quality gates, automation, and reporting
  • Understanding of the Azure ecosystem and its role in test environments, pipelines, and deployments
  • Excellent communication skills for test planning, documentation, and reporting
Job Responsibility
  • Define and implement the platform-wide test strategy, covering integration, contract, end-to-end, performance, chaos, and security testing
  • Design and maintain a pytest-based automation framework, including reusable fixtures, data factories, and environment setups
  • Develop end-to-end test suites covering cross-system workflows (data ingestion → processing → AI systems → governance → audit)
  • Own and implement contract testing (Pact or equivalent) to ensure service compatibility across teams
  • Design and execute performance testing (k6, Locust, JMeter), ensuring scalability and system reliability
  • Define and enforce quality gates in CI/CD pipelines (Azure DevOps – ADO), including merge and release criteria
  • Manage test stability, including flaky test detection, triaging, and resolution
  • Collaborate with AI teams to integrate model and agent evaluation testing into release pipelines
  • Lead and mentor QA engineers, conduct test design reviews, and promote automation best practices
  • Fulltime

QA Engineering Lead, AI Native

Meta is seeking a QA Engineering Lead with expertise in AI product and model tes...
Location:
United States, Menlo Park
Salary:
138000.00 - 191000.00 USD / Year
Meta
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • 5+ years of experience in quality assurance, test engineering, and test automation
  • 1+ years of hands-on experience testing AI-powered products (web, iOS, and/or Android) that generate or transform text, images, and/or voice, including end-to-end feature validation and user experience quality
  • 1+ years of hands-on experience testing, debugging, and evaluating LLM/multimodal model behavior, including defining and applying quality standards for accuracy, relevance, grounding, safety/policy compliance, and cultural/locale sensitivity, and driving model-quality regressions to resolution
  • Experience effectively utilizing AI technologies and tools (e.g., large language models, agents, etc.) to enhance QA workflows
  • Experience collaborating cross-functionally and contributing to technical decisions through influence, communication, and execution
  • Experience shifting priorities quickly and adapting effectively in a fast-moving product development cycle
Job Responsibility
  • Build and foster a quality-driven engineering environment that enables rapid, confident product releases, ensuring that quality is embedded throughout the development lifecycle
  • Develop and implement robust evaluation processes for AI models, including prompt engineering, scenario-based, and adversarial testing for text, image, and voice AI systems
  • Drive the quality for products and features, assess risks, and ensure features ship with a high quality bar, balancing speed and experience
  • Plan, develop, and execute comprehensive test strategies across core Meta products and platforms, leveraging both manual and automated approaches
  • Lead quality assurance efforts that align with product objectives, developing scalable solutions to support rapid product iteration and deployment
  • Solve cross-platform engineering challenges and contribute impactful ideas to improve quality, reliability, and user experience across diverse product surfaces
  • Implement and evolve QA processes to obtain effective test signals and scale testing efforts across multiple products, ensuring continuous improvement
  • Define quality metrics and implement measurements to determine test effectiveness, testing efficiency, and overall product quality, using data-driven insights to guide decisions
  • Partner with engineering and infrastructure teams to leverage automation for scalable solutions, preventing regressions and ensuring the reliability of products and AI models
  • Apply Responsible AI practices including safety, ethics, alignment, and explainability by building safeguards and quality controls to validate AI outputs, ensuring transparency, and compliance with ethical standards
What we offer
  • bonus
  • equity
  • benefits
  • Fulltime

TechOps Transformation Lead - AI Support

My client is a market-leading organisation and recognised industry leader operat...
Location:
United Kingdom
Salary:
Not provided
Xcede
Expiration Date
June 15, 2026
Requirements
  • Proven experience leading customer support transformations or operational insourcing programmes
  • Strong commercial acumen with the ability to build and defend financial business cases
  • Experience implementing AI-powered support platforms or CRM systems
  • Comfortable operating at pace in high-stakes, ambiguous environments
  • Strong stakeholder management across business, operations, finance, legal, and technology teams
  • Experience managing supplier transitions and service redesign
  • Clear communicator able to produce executive-ready outputs and drive decisions
Job Responsibility
  • Own and deliver the full transformation from discovery through to rollout with minimal oversight
  • Establish governance, reporting cadence, stakeholder communication, and executive decision points
  • Manage risks, blockers, and cross-functional delivery across Finance, Commercial, Legal, People, Systems, and Engineering
  • Assess current support operations including cost base, volumes, SLAs, supplier model, and customer satisfaction
  • Build and maintain the financial model and produce a board-ready business case
  • Identify savings opportunities across AI deflection, automation, productivity, supplier strategy, and operating model redesign
  • Lead evaluation and implementation of AI-enabled support platforms such as Zendesk, Salesforce Service Cloud, Intercom, and similar tools
  • Define selection criteria around AI capability, workflow automation, CRM integration, and reporting
  • Oversee platform setup, routing workflows, self-service capability, dashboards, and the long-term technology blueprint
  • Design a layered support model across self-service, operational support, escalations, and premium service functions

Digital Quality Engineer Leader

The Quality Engineering Leader is responsible for defining, governing, and scali...
Location:
United States, Tampa; Toledo
Salary:
Not provided
Owens Corning
Expiration Date
Until further notice
Requirements
  • Bachelor's degree in a relevant field (computer science, management information systems) or equivalent experience
  • 5+ years leading QA teams (multiple direct reports and/or contractors), with the ability to successfully lead projects and leaders across regions and third parties
  • 10+ years of experience in software testing and quality engineering across manual and automated practices
  • Demonstrated problem solving skills and the ability to make data-driven decisions
  • Experience with stakeholder management and effective communication across all levels of the organization
  • Experience operating within Agile delivery models with embedded QA resources
  • Possesses a combination of application development, IT strategy and relationship management experience that demonstrates an ability to both conceive and deliver solutions
  • Working knowledge of quality engineering principles and best practices
  • Proven ability to foster effective collaboration within culturally and geographically diverse quality engineering and test automation teams
Job Responsibility
  • Build strong partnerships with Engineering, Product, DevOps, and business stakeholders to align quality outcomes with enterprise objectives
  • Understand Owens Corning's business strategies and how digital platforms enable growth and customer value
  • Apply outside-in market insights and best-in-class quality practices to continuously improve enterprise quality capability
  • Develop and maintain a strong understanding of Owens Corning specific business processes and operations locally and globally
  • Define and execute the enterprise Quality Engineering strategy aligned to an embedded QA operating model with centralized standards
  • Establish and govern standards for test strategy, automation, coverage models, quality gates, and risk-based testing
  • Ensure quality practices scale across sprint teams without creating delivery bottlenecks
  • Evolve quality engineering capabilities to support future embedded and IoT domains
  • Set strategic direction for automation across web, API, and mobile layers
  • Ensure automation delivers fast, reliable, and actionable feedback within CI/CD pipelines
  • Fulltime

Principal Software Consultant - AI/ML Engineer

As an ML Team Lead, you will be responsible for leading the technical direction ...
Location:
Pakistan, Lahore, Karachi, Islamabad
Salary:
Not provided
10Pearls
Expiration Date
Until further notice
Requirements
  • Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Data Science, Software Engineering, or a related field
  • 7+ years of professional software engineering experience with at least 5 years of hands-on experience building and deploying ML systems into production
  • Prior experience as a Tech Lead, Staff Engineer, or hands-on lead for AI/ML engineering teams
  • Strong expertise in classical machine learning domains such as forecasting, ranking, classification, and optimization
  • Hands-on experience building modern LLM and agentic AI systems including RAG pipelines, tool-using agents, multi-step workflows, and evaluation systems
  • Strong proficiency in Python and backend system development
  • Experience with ML frameworks such as PyTorch or TensorFlow
  • Strong understanding of scalable distributed systems, APIs, system integration, architecture design, and production engineering practices
  • Experience operating ML services at scale, including SLO management, monitoring, on-call practices, and incident response
  • Experience working with Kubernetes-based deployments, CI/CD pipelines, and modern cloud-native engineering practices
Job Responsibility
  • Lead the technical direction for the team’s ML and LLM systems, including architecture patterns, platform choices, evaluation frameworks, and engineering standards
  • Stay hands-on by designing and implementing complex ML and agentic AI systems, writing production-grade code, and leading through technical execution
  • Design, develop, and deploy scalable ML and LLM-powered applications and services in production environments
  • Build and optimize AI-powered solutions such as RAG systems, multi-step agents, AI assistants, chatbots, forecasting systems, ranking models, classification models, and optimization systems
  • Drive architecture and design reviews to ensure scalability, reliability, security, and maintainability of AI/ML systems
  • Own the technical roadmap for ML/LLM initiatives and translate business objectives into execution plans and scalable solutions
  • Collaborate closely with Product Managers, Engineers, Data Engineers, MLOps Engineers, QA Engineers, and cross-functional stakeholders to deliver business-aligned AI solutions
  • Establish engineering best practices for prompt engineering, model evaluation, regression testing, observability, and production readiness
  • Define and implement quality standards, evaluation suites, acceptance metrics, and regression plans for all AI/ML features
  • Ensure high availability, scalability, and resilience of tier-1 ML services through SLOs, monitoring, incident response, failover strategies, circuit breakers, and multi-zone deployments
  • Fulltime