This is where new knowledge is discovered. Baxter's Research and Development teams work cross-functionally to innovate, develop, and introduce creative solutions for patients' needs globally. From scientists to engineers, your work creates the products that save and sustain lives.
Job Responsibilities:
Design and execute validation strategies for AI/ML models, including supervised learning models and large language models (LLMs)
Perform functional, regression, and behavioral testing of ML models across versions
Validate model outputs for accuracy, consistency, bias, and edge cases
Evaluate LLM responses for correctness, relevance, hallucinations, and safety
Design and implement data quality validation checks for training, validation, and inference datasets
Detect and analyze data drift and concept drift across model iterations and production data
Validate input data assumptions, feature distributions, and schema consistency
Collaborate with data and ML engineers to identify and resolve data-related issues impacting model performance
Develop Python-based validation scripts and frameworks to automate model evaluation, data validation, and regression testing across model versions
Implement automated checks for model performance metrics (precision, recall, F1, accuracy, etc.)
Build reusable validation utilities that integrate with ML workflows and pipelines
Execute LLM evaluation workflows including prompt-response validation, golden dataset comparisons, and regression testing across prompt or model changes
Contribute to evaluation strategies for RAG (Retrieval-Augmented Generation) or multi-step LLM pipelines
Support responsible AI initiatives by testing bias, robustness, and failure modes
Work closely with ML engineers, data scientists, platform teams, and product teams
Actively participate in requirement discussions and model review sessions
Clearly document test results, risks, and quality insights for stakeholders
Contribute to continuous improvement of AI testing practices and standards
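To make the metric-check and regression-testing responsibilities above concrete, here is a minimal Python sketch of the kind of automated check such a role might involve. All function names and the tolerance threshold are illustrative assumptions, not an actual Baxter framework:

```python
def compute_metrics(y_true, y_pred):
    """Compute the core classification metrics named in the posting
    (precision, recall, F1, accuracy) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(y_true)
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}

def check_regression(current, baseline, tolerance=0.02):
    """Regression check across model versions: flag (False) any metric that
    drops more than `tolerance` below the baseline model's value."""
    return {name: current[name] >= baseline[name] - tolerance for name in baseline}
```

A check like this would typically run in CI whenever a new model version or prompt change is proposed, turning the "regression testing across versions" responsibility into an automated gate.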
Requirements:
4+ years of hands-on experience in AI/ML model testing, validation, or quality engineering
Strong proficiency in Python for model validation and data analysis
Experience validating supervised ML models and/or LLM-based systems
Solid understanding of ML evaluation metrics and validation techniques
Experience with data quality checks, drift detection, and dataset validation
Familiarity with ML pipelines and model lifecycle (training, validation, inference)
Ability to analyze model behavior and explain quality risks clearly
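The drift-detection experience listed above might, for example, involve a statistic such as the Population Stability Index (PSI), a common heuristic for comparing a production feature distribution against its training baseline. The sketch below is one illustrative implementation, not a prescribed method; the bin count and smoothing constant are assumptions:

```python
import math

def psi(expected, actual, bins=10):
    """Population Stability Index between two numeric samples.
    Bins are derived from the `expected` (baseline) sample; a small
    smoothing constant avoids log(0) for empty bins. As a rule of thumb,
    values near 0 indicate stability and larger values indicate drift."""
    lo, hi = min(expected), max(expected)
    edges = [lo + (hi - lo) * i / bins for i in range(1, bins)]

    def fractions(sample):
        counts = [0] * bins
        for x in sample:
            counts[sum(x > e for e in edges)] += 1
        return [(c + 1e-6) / (len(sample) + 1e-6 * bins) for c in counts]

    e_frac, a_frac = fractions(expected), fractions(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(e_frac, a_frac))
```

Identical distributions yield a PSI near zero, while a shifted production sample produces a large value, which a validation pipeline could compare against an agreed alerting threshold.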