An AI QA Engineer (Agents) is responsible for ensuring the quality, reliability, and performance of AI agents and agentic experiences. The role involves designing and executing test strategies, identifying defects, and working closely with engineering teams to ensure high‑quality releases of AI agent solutions for business use cases. It requires strong attention to detail, analytical thinking, and the ability to think like both a user and a developer. The ideal candidate is passionate about quality, enjoys finding creative ways to test AI agents across varied scenarios and edge cases, and has experience testing AI/ML applications, with particular strength in conversational interfaces, LLM integrations, and AI agent workflows. A desire to learn and grow every day is essential for success in this role: the landscape changes daily, and we are changing with it.
Job Responsibilities
Design and execute test plans for AI agents and agentic experiences
Write and maintain automated test suites for agent functionality (unit tests, evals, integration tests, end‑to‑end tests)
Perform (minimal) manual testing of agent interactions, workflows, and business logic
Test agent responses, accuracy, and behavior across various scenarios and edge cases
Identify, document, and track bugs through resolution
Collaborate with engineers, product managers, and business stakeholders to understand requirements and acceptance criteria
Participate in test planning, test case design, and test strategy discussions
Create and maintain test data, test scenarios, and test environments for agents
Participate in feature design sessions, highlighting key testing scenarios and fault zones
Execute performance and load testing to ensure agent scalability and response times
Validate agent integrations with business systems, APIs, and data sources
Test agent security features and validate compliance with security requirements
Participate in release planning and ensure quality gates are met before releases
Contribute to improving testing processes and test automation infrastructure for AI agents
Requirements
4+ years' total experience, including 1+ year testing AI/ML applications, LLM integrations, or conversational interfaces
Hands-on experience with end-to-end testing and automation for AI/agentic products
3+ years of experience in software quality assurance or testing
1+ years of experience testing AI/ML applications, LLM integrations, or conversational interfaces
Strong understanding of software testing principles, methodologies, and best practices
Experience writing and maintaining automated tests (unit, integration, or end‑to‑end)
Proficiency in at least one programming language (Python, TypeScript, JavaScript, Java, etc.)
Experience with API testing tools (Postman, REST Assured, etc.) or frameworks
Strong analytical and problem‑solving skills
Excellent attention to detail and ability to identify edge cases
Good written and verbal communication skills
Experience with bug tracking systems and test management tools
Ability to work collaboratively with engineering and product teams
Understanding of CI/CD pipelines and test automation in continuous integration
Interest in AI/ML concepts and understanding of how to test AI systems
Nice to have
Experience with AI Eval tools or frameworks
Experience testing AI agents, chatbots, or virtual assistants
Background in testing LLM integrations and prompt‑based systems
Experience with agent testing frameworks and tools
Knowledge of testing RAG (Retrieval Augmented Generation) systems
Experience with performance testing tools (JMeter, k6, Locust, etc.)
Experience with test automation frameworks (Playwright, Cypress, Selenium, pytest, etc.)
Familiarity with cloud platforms and testing cloud‑native applications
Experience with observability tools and using metrics/logs for test validation
Knowledge of security testing and vulnerability assessment for AI applications
Experience with contract testing and API mocking
Familiarity with prompt testing and LLM response validation