This role will report to the Data Science Enablement Manager and support the Rare Disease Business Unit (RDBU) patient finding team. The candidate will work closely with data scientists to build scalable pipelines, productionize models, and establish robust evaluation and monitoring frameworks to enable reliable, high-impact deployment of patient finding solutions.
Job Responsibilities:
Build and maintain scalable data and ML pipelines to support patient finding use cases across the patient journey
Productionize machine learning models by developing deployment workflows, APIs, and batch/real-time scoring pipelines
Design and implement model evaluation, validation, and monitoring frameworks (performance tracking, drift detection, alerting)
Enable end-to-end ML lifecycle management, including training, versioning, deployment, and retraining workflows
Partner with RDBU data science teams to translate analytical solutions into production-ready systems
Develop ML-ready datasets and feature pipelines, ensuring data quality, consistency, and reusability
Support model tracking and experiment management using standardized tools and frameworks
Build tools and utilities to monitor, track, and operationalize model outputs for downstream consumption
Collaborate with enterprise data and platform teams to ensure compliance with data governance, security, and architecture standards
Follow engineering best practices for code quality, documentation, testing, and CI/CD integration
Requirements:
Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related technical field
3–5 years of experience in ML engineering, data engineering, or related roles
Strong programming skills in Python and SQL
Experience with data pipeline development and distributed computing (e.g., Spark/PySpark)
Working knowledge of Databricks and at least one cloud platform (AWS, Azure, or GCP)
Experience with ML lifecycle tools (e.g., MLflow, Git, CI/CD pipelines)
Understanding of model deployment, monitoring, and reproducibility practices
Nice to have:
Experience supporting production ML systems in healthcare or commercial analytics contexts
Familiarity with model monitoring concepts (data drift, model decay, performance tracking)
Experience building feature stores or reusable data assets
Exposure to patient journey or patient finding use cases
Experience with containerization and orchestration frameworks
Strong collaboration skills and ability to work closely with data science and analytics teams
Passion for building scalable systems and enabling data science teams