We are seeking a detail-oriented Senior QA Engineer to lead quality assurance efforts across our complex data flows and AI-driven products, with a specific focus on conversational agents. This role requires a blend of automation expertise and strong analytical skills to define quality standards, validate complex data outputs, and ensure the reliable and responsible behavior of our generative AI systems.
Job Responsibilities
Own quality assurance efforts for data pipelines, APIs, and conversational agents
Design and execute generative AI testing scenarios, including exploratory testing for chatbot interactions, intent recognition, tone, and edge-case behavior
Design LLM tests, including prompt and grounding validation, golden sets, and non-deterministic assertions
Perform crucial safety and quality checks, such as testing for hallucination, bias, toxicity, and PII leakage
Validate structured and unstructured data outputs, ensuring consistency, accuracy, and compliance
Establish and drive real-time testing approaches, including streaming data validation and API monitoring
Collaborate with ML and NLP teams to define comprehensive evaluation metrics and criteria for agent performance
Integrate AI-driven tools (like Copilot) into the QA lifecycle to accelerate test design, documentation, and defect analysis
Requirements
5+ years of experience in QA engineering, with proven experience in testing data systems or AI applications
Strong analytical skills and attention to detail, capable of spotting subtle inconsistencies in data and agent behavior
Proficiency in Python and SQL for data validation, automation scripting, and test preparation
Experience with API testing (REST/GraphQL automation and validation)
Familiarity with chatbot frameworks, LLMs, or conversational testing
(Desirable) Experience or strong interest in using LLM-based assistants (e.g., GitHub Copilot, ChatGPT) in test design and execution