We are seeking a Senior Automation Engineer with 6+ years of strong hands-on experience in AI/ML model validation and testing. This role focuses on ensuring the quality, reliability, accuracy, and robustness of AI/ML and LLM-based systems through systematic validation, data quality checks, and automated evaluation pipelines. The ideal candidate has solid experience working with supervised ML models, LLMs, and ML pipelines, strong Python-based validation scripting skills, and a deep understanding of data, model behavior, and evaluation metrics.
Job Responsibilities:
Design and execute validation strategies for AI/ML models, including supervised learning models and large language models (LLMs)
Perform functional, regression, and behavioral testing of ML models across versions
Validate model outputs for accuracy, consistency, bias, and edge cases
Evaluate LLM responses for correctness, relevance, hallucinations, and safety
Design and implement data quality validation checks for training, validation, and inference datasets
Detect and analyze data drift and concept drift across model iterations and production data
Validate input data assumptions, feature distributions, and schema consistency
Collaborate with data and ML engineers to identify and resolve data-related issues impacting model performance
Develop Python-based validation scripts and frameworks to automate model evaluation, data validation, and regression testing across model versions
Implement automated checks for model performance metrics (precision, recall, F1, accuracy, etc.)
Build reusable validation utilities that integrate with ML workflows and pipelines
Execute LLM evaluation workflows including prompt-response validation, golden dataset comparisons, and regression testing across prompt or model changes
Contribute to evaluation strategies for RAG or multi-step LLM pipelines
Support responsible AI initiatives by testing bias, robustness, and failure modes
Work closely with ML engineers, data scientists, and platform and product teams
Actively participate in requirement discussions and model review sessions
Clearly document test results, risks, and quality insights for stakeholders
Contribute to continuous improvement of AI testing practices and standards
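To illustrate the kind of Python-based validation scripting the role involves, here is a minimal sketch of an automated performance-metric check (assuming scikit-learn is available; the threshold values are hypothetical release-gate choices, not figures from this posting):

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical minimum acceptable metrics for promoting a model version.
THRESHOLDS = {"accuracy": 0.90, "precision": 0.85, "recall": 0.85, "f1": 0.85}

def evaluate_model(y_true, y_pred):
    """Compute core classification metrics for a candidate model version."""
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, average="macro", zero_division=0),
        "recall": recall_score(y_true, y_pred, average="macro", zero_division=0),
        "f1": f1_score(y_true, y_pred, average="macro", zero_division=0),
    }

def check_thresholds(metrics, thresholds=THRESHOLDS):
    """Return the names of metrics that fall below their release threshold."""
    return [name for name, value in metrics.items() if value < thresholds[name]]
```

A check like this can run in CI after each retraining job, failing the pipeline whenever `check_thresholds` returns a non-empty list.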
Requirements:
4+ years of hands-on experience in AI/ML model testing, validation, or quality engineering
Strong proficiency in Python for model validation and data analysis
Experience validating supervised ML models and/or LLM-based systems
Solid understanding of ML evaluation metrics and validation techniques
Experience with data quality checks, drift detection, and dataset validation
Familiarity with ML pipelines and model lifecycle (training, validation, inference)
Ability to analyze model behavior and explain quality risks clearly
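As an example of the drift-detection experience listed above, data drift between a training baseline and production data can be flagged per feature with a two-sample Kolmogorov-Smirnov test. A minimal sketch, assuming NumPy and SciPy are available (the 0.05 significance level is an illustrative choice):

```python
import numpy as np
from scipy.stats import ks_2samp

def detect_drift(baseline, production, alpha=0.05):
    """Flag each feature whose production distribution differs significantly
    from the training baseline, using a two-sample KS test per feature.

    baseline, production: dicts mapping feature name -> 1-D array of values.
    Returns a dict mapping feature name -> True if drift is likely.
    """
    drifted = {}
    for feature in baseline:
        statistic, p_value = ks_2samp(baseline[feature], production[feature])
        drifted[feature] = bool(p_value < alpha)  # True means likely drift
    return drifted
```

In practice a report like this would feed monitoring dashboards or alerting, so that drifted features trigger a data review before the model is retrained or promoted.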