Senior Data Engineer - AI Focused

France, Paris · Job Posted January 06, 2026

Job Description

At Doctolib, we're on a mission to transform healthcare through the power of AI. As a Senior Data Engineer, you'll play a key role in building and optimizing the data foundations within the AI Team to deliver safe, scalable, and impactful models. You will join a dedicated team working on data infrastructure for LLM, VLM and RAG-based systems, powering our new AI Medical Companion. Your work will ensure that our engineers and data scientists can train, evaluate, and deploy AI models efficiently on high-quality, well-structured, and compliant data.

Job Responsibility

  • Ensure high standards of data quality for AI model inputs
  • Design, build, and maintain scalable data pipelines on Google Cloud Platform (GCP) for AI and machine learning use cases
  • Implement data ingestion and transformation frameworks that power Retrieval systems and training datasets for LLMs and multimodal models
  • Architect and manage NoSQL and Vector Databases to store and retrieve embeddings, documents, and model inputs efficiently
  • Collaborate with ML and platform teams to define data schemas, partitioning strategies, and governance rules that ensure privacy, scalability, and reliability
  • Integrate unstructured and structured data sources (text, speech, image, documents, metadata) into unified data models ready for AI consumption
  • Optimize performance and cost of data pipelines using GCP native services (BigQuery, Dataflow, Pub/Sub, Cloud Storage, Vertex AI)
  • Contribute to data quality and lineage frameworks, ensuring AI models are trained on validated, auditable, and compliant datasets
  • Continuously evaluate and improve our data stack to accelerate AI experimentation and deployment

Requirements

  • Master’s or Ph.D. degree in Computer Science, Data Engineering, or a related field
  • 5+ years of experience in Data Engineering, ideally supporting AI or ML workloads
  • Strong experience with the GCP data ecosystem
  • Proficiency in Python and SQL, with experience in data pipeline orchestration (e.g., Airflow, Dagster, Cloud Composer)
  • Deep understanding of NoSQL systems (e.g., MongoDB) and vector databases (e.g., FAISS, Vector Search)
  • Experience designing data architectures for RAG, embeddings, or model training pipelines
  • Knowledge of data governance, security, and compliance for sensitive or regulated data
  • Familiarity with W&B / MLflow / Braintrust / DVC for experiment tracking and dataset versioning (extract snapshots, change tracking, reproducibility)
  • Familiarity with containerized environments (Docker, Kubernetes) and CI/CD for data workflows
  • A collaborative mindset and passion for building the data foundations of next-generation AI systems

What we offer

  • Free comprehensive health insurance for you and your children
  • Parent Care Program: additional leave on top of the legal parental leave
  • Free mental health and coaching services through our partner Moka.care
  • For caregivers and workers with disabilities, a package including an adaptation of the remote policy, extra days off for medical reasons, and psychological support
  • Work from EU countries and the UK for up to 10 days per year, thanks to our flexibility days policy
  • Work Council subsidy to refund part of a sport club membership or a creative class
  • Up to 14 days of RTT
  • Lunch voucher with Swile card

Similar Jobs for Senior Data Engineer - AI Focused

8 matching positions

Senior AI Data Engineer

The Senior AI Data Engineer (Applications Development Technology Lead Analyst - ...
Location: Canada, Mississauga
Salary: 120800.00 - 170800.00 USD / Year
Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 6+ years of professional experience in software development with a focus on AI, machine learning, or agent-based systems
  • Strong proficiency in Python, SQL
  • Java is a plus
  • Solid understanding of core AI concepts, including knowledge representation, automated planning, decision-making under uncertainty, and multi-agent systems
  • Experience with machine learning frameworks (e.g., TensorFlow, PyTorch) and relevant libraries (e.g., Scikit-Learn, NumPy, Pandas)
  • Familiarity with large language models (LLMs) and their application in agentic systems
  • Familiarity with specific agent frameworks (e.g., LangChain, AutoGen, CrewAI, RAG) or research in multi-agent reinforcement learning
  • Experience in designing and implementing APIs for AI services
  • Experience with software development best practices, including version control (Git), CI/CD pipelines, testing, and code reviews
  • Excellent analytical and problem-solving skills with a creative approach to complex challenges
Job Responsibility
  • Design and implement intelligent agents, including their perception, reasoning, planning, and action execution modules
  • Develop scalable and robust architectures for agentic systems, ensuring high performance, reliability, and security
  • Integrate various machine learning models (e.g., LLMs, reinforcement learning, predictive models) to enhance agent capabilities and decision-making
  • Develop agents that can automate complex tasks, optimize workflows, and solve real-world problems across various domains
  • Utilize and contribute to agentic AI frameworks and development tools
  • Design and implement metrics and evaluation strategies for agent performance, continuously optimizing and improving agent behavior
  • Stay abreast of the latest advancements in AI, particularly in agent-based systems, autonomous AI, and related fields, and propose innovative solutions
  • Work closely with cross-functional teams including AI researchers, data scientists, product managers, and software engineers to integrate agentic solutions into broader products and services
  • Create comprehensive technical documentation for agent designs, implementations, and operational procedures
  • Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency
Employment type: Full-time

Senior AI Data Engineer

The Senior AI Data Engineer (Applications Development Technology Lead Analyst - ...
Location: Canada, Mississauga
Salary: 120800.00 - 170800.00 USD / Year
Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 6+ years of professional experience in software development with a focus on AI, machine learning, or agent-based systems
  • Strong proficiency in Python, SQL
  • Solid understanding of core AI concepts, including knowledge representation, automated planning, decision-making under uncertainty, and multi-agent systems
  • Experience with machine learning frameworks (e.g., TensorFlow, PyTorch) and relevant libraries (e.g., Scikit-Learn, NumPy, Pandas)
  • Familiarity with large language models (LLMs) and their application in agentic systems
  • Familiarity with specific agent frameworks (e.g., LangChain, AutoGen, CrewAI, RAG) or research in multi-agent reinforcement learning
  • Experience in designing and implementing APIs for AI services
  • Experience with software development best practices, including version control (Git), CI/CD pipelines, testing, and code reviews
  • Excellent analytical and problem-solving skills
  • Strong written and verbal communication skills
Job Responsibility
  • Design and implement intelligent agents, including their perception, reasoning, planning, and action execution modules
  • Develop scalable and robust architectures for agentic systems
  • Integrate various machine learning models to enhance agent capabilities and decision-making
  • Develop agents that can automate complex tasks, optimize workflows, and solve real-world problems
  • Utilize and contribute to agentic AI frameworks and development tools
  • Design and implement metrics and evaluation strategies for agent performance
  • Stay abreast of the latest advancements in AI
  • Work closely with cross-functional teams
  • Create comprehensive technical documentation
  • Appropriately assess risk when business decisions are made
Employment type: Full-time

Senior Staff Data Engineer- ML & AI Platform

At Marktplaats, data is at the heart of everything we do, but Intelligence is wh...
Location: Netherlands, Amsterdam
Salary: Not provided
Adevinta (adevinta.com)
Expiration Date: Until further notice
Requirements
  • 10+ years of experience with a specific focus on the intersection of Data Engineering, MLOps, and AI Infrastructure
  • Deep knowledge of Spark internals, structured streaming, and performance tuning for large-scale data processing
  • Proven experience architecting end-to-end ML platforms for Traditional ML (Classic MLOps) while actively enabling the organization on Generative AI concepts
  • Strong background in building automated pipelines and ensuring system observability
  • Practical experience building infrastructure for Large Language Models, including managing the complexity of chaining models and tools
  • Solid experience serving models at low latency and high concurrency using containerized solutions
  • Ability to speak the language of AI/ML Engineers and effectively bridge the gap between experimental code and production systems
  • Expert level Python
  • Experience with PyTorch, Terraform, Terragrunt, Docker, Kubernetes, GitHub Actions, Datadog
  • Experience with Databricks AI Stack: MLflow, Mosaic AI, Unity Catalog, Feature Store, Databricks Model Serving, Vector Databases
Job Responsibility
  • Lead the evolution of our Machine Learning & AI Platform, designing the architecture for AI Agents and establishing patterns for Vector Databases
  • Act as a first mover: validate new Databricks features and integrate them into the platform
  • Write the guidelines for GenAI development, helping teams transition from notebook experiments to production-grade LLM applications
  • Design the Feature Store, manage the Model Registry, and set up the infrastructure for Vector Search and RAG (Retrieval Augmented Generation) workflows
  • Elevate the technical bar of the team, mentoring Staff and Senior engineers on design patterns, code quality, and architectural decisions
  • Translate complex requirements from ML Engineers and Data Scientists into robust engineering tickets and infrastructure roadmaps
What we offer
  • An attractive Base Salary
  • Participation in our Short Term Incentive plan (annual bonus)
  • Work From Anywhere: Enjoy up to 20 days a year of working from anywhere
  • A 24/7 Employee Assistance Program for you and your family
Employment type: Full-time

Senior Staff Data Engineer- ML & AI Platform

At Marktplaats, data is at the heart of everything we do, but Intelligence is wh...
Location: Netherlands, Amsterdam
Salary: Not provided
Adevinta (adevinta.com)
Expiration Date: Until further notice
Requirements
  • 10+ years of experience with a specific focus on the intersection of Data Engineering, MLOps, and AI Infrastructure
  • Deep knowledge of Spark internals, structured streaming, and performance tuning for large-scale data processing
  • Proven experience architecting end-to-end ML platforms for Traditional ML (Classic MLOps) while actively enabling the organization on Generative AI concepts
  • Strong background in building automated pipelines and ensuring system observability
  • Practical experience building infrastructure for Large Language Models, including managing the complexity of chaining models and tools
  • Solid experience serving models at low latency and high concurrency using containerized solutions
  • Ability to speak the language of AI/ML Engineers and effectively bridge the gap between experimental code and production systems
Job Responsibility
  • Lead the evolution of our Machine Learning & AI Platform, designing the architecture for AI Agents and establishing patterns for Vector Databases
  • Act as a first mover: validate new Databricks features and integrate them into the platform
  • Write the guidelines for GenAI development, helping teams transition from notebook experiments to production-grade LLM applications
  • Design the Feature Store, manage the Model Registry, and set up the infrastructure for Vector Search and RAG (Retrieval Augmented Generation) workflows
  • Elevate the technical bar of the team, mentoring Staff and Senior engineers on design patterns, code quality, and architectural decisions
  • Translate complex requirements from ML Engineers and Data Scientists into robust engineering tickets and infrastructure roadmaps
What we offer
  • An attractive Base Salary
  • Participation in our Short Term Incentive plan (annual bonus)
  • Work From Anywhere: Enjoy up to 20 days a year of working from anywhere
  • A 24/7 Employee Assistance Program for you and your family
  • A collaborative environment with an opportunity to explore your potential and grow
Employment type: Full-time

Senior/ Principal Backend Engineer - Data & AI Security (Cortex Cloud)

At Palo Alto Networks®, we’re united by a shared mission—to protect our digital ...
Location: Israel, Tel Aviv
Salary: Not provided
Palo Alto Networks (paloaltonetworks.com)
Expiration Date: Until further notice
Requirements
  • 5+ years of hands-on software engineering experience, with deep expertise in at least one of the following: Kotlin/Java, Python, or Go
  • Experience working with different cloud services on at least one major cloud provider (AWS, Azure, GCP)
  • Proven experience designing and building large-scale, scalable cloud-based applications
  • Expertise in microservices architecture, including technologies like Kubernetes, Docker, GKE, EKS, or AKS
  • Experience with relational or NoSQL databases (e.g., MySQL, PostgreSQL, MongoDB) and ORMs (e.g., JPA, Hibernate)
  • Bachelor of Science in Computer Science or equivalent practical experience (e.g., elite software unit in the military)
Job Responsibility
  • Drive innovation by designing and implementing impactful solutions that address client needs, contributing to the full feature development lifecycle from design to deployment
  • Take ownership of specific feature segments, ensuring high-quality code and robust functionality through meticulous attention to detail and a focus on execution
  • Proactively collaborate and exchange information with cross-functional teams, including product and infrastructure, to ensure seamless integration and alignment on shared objectives
  • Challenge the status quo by generating innovative ideas and actively participating in brainstorming sessions to foster product and architectural improvements
  • Actively engage in technical discussions, openly sharing knowledge and learning from others to solve complex problems and elevate team expertise
  • Design and build highly scalable, resilient, and secure cloud-based applications and microservices
  • Contribute to an agile and dynamic engineering culture, demonstrating a strong drive and outstanding communication skills to deliver results efficiently
Employment type: Full-time

Senior/ Principal Backend Engineer - Data & AI Security (Cortex Cloud)

In this role, you will be dedicated to safeguarding our clients’ data within the...
Location: Israel, Tel Aviv
Salary: Not provided
Palo Alto Networks Italia (paloaltonetworks.it)
Expiration Date: Until further notice
Requirements
  • 5+ years of hands-on software engineering experience, with deep expertise in at least one of the following: Kotlin/Java, Python, or Go
  • Experience working with different cloud services on at least one major cloud provider (AWS, Azure, GCP)
  • Proven experience designing and building large-scale, scalable cloud-based applications
  • Expertise in microservices architecture, including technologies like Kubernetes, Docker, GKE, EKS, or AKS
  • Experience with relational or NoSQL databases (e.g., MySQL, PostgreSQL, MongoDB) and ORMs (e.g., JPA, Hibernate)
  • Bachelor of Science in Computer Science or equivalent practical experience (e.g., elite software unit in the military)
Job Responsibility
  • Drive innovation by designing and implementing impactful solutions that address client needs, contributing to the full feature development lifecycle from design to deployment
  • Take ownership of specific feature segments, ensuring high-quality code and robust functionality through meticulous attention to detail and a focus on execution
  • Proactively collaborate and exchange information with cross-functional teams, including product and infrastructure, to ensure seamless integration and alignment on shared objectives
  • Challenge the status quo by generating innovative ideas and actively participating in brainstorming sessions to foster product and architectural improvements
  • Actively engage in technical discussions, openly sharing knowledge and learning from others to solve complex problems and elevate team expertise
  • Design and build highly scalable, resilient, and secure cloud-based applications and microservices
  • Contribute to an agile and dynamic engineering culture, demonstrating a strong drive and outstanding communication skills to deliver results efficiently
Employment type: Full-time

Senior/ Principal Engineer Software - Data & AI Security (Cortex Cloud)

In this role, you will be dedicated to safeguarding our clients’ data within the...
Location: Israel, Tel Aviv
Salary: Not provided
Palo Alto Networks (paloaltonetworks.com)
Expiration Date: Until further notice
Requirements
  • 5+ years of hands-on software engineering experience, with deep expertise in at least one of the following: Kotlin/Java, Python, or Go
  • Experience working with different cloud services on at least one major cloud provider (AWS, Azure, GCP)
  • Proven experience designing and building large-scale, scalable cloud-based applications
  • Expertise in microservices architecture, including technologies like Kubernetes, Docker, GKE, EKS, or AKS
  • Experience with relational or NoSQL databases (e.g., MySQL, PostgreSQL, MongoDB) and ORMs (e.g., JPA, Hibernate)
  • Bachelor of Science in Computer Science or equivalent practical experience (e.g., elite software unit in the military)
Job Responsibility
  • Drive innovation by designing and implementing impactful solutions that address client needs, contributing to the full feature development lifecycle from design to deployment
  • Take ownership of specific feature segments, ensuring high-quality code and robust functionality through meticulous attention to detail and a focus on execution
  • Proactively collaborate and exchange information with cross-functional teams, including product and infrastructure, to ensure seamless integration and alignment on shared objectives
  • Challenge the status quo by generating innovative ideas and actively participating in brainstorming sessions to foster product and architectural improvements
  • Actively engage in technical discussions, openly sharing knowledge and learning from others to solve complex problems and elevate team expertise
  • Design and build highly scalable, resilient, and secure cloud-based applications and microservices
  • Contribute to an agile and dynamic engineering culture, demonstrating a strong drive and outstanding communication skills to deliver results efficiently
Employment type: Full-time

Senior Machine Learning Engineer, Data for Embodied AI

The goal of this role is to build, scale, and optimise next-generation world mod...
Location: United Kingdom, London
Salary: Not provided
Wayve (wayve.ai)
Expiration Date: Until further notice
Requirements
  • Experience in ML engineering, data engineering, or applied ML roles focused on large-scale data systems
  • Proven experience building and maintaining large-scale data pipelines for machine learning, including data ingestion, transformation, and validation
  • Strong Python fundamentals and experience with modern ML and data frameworks (e.g. PyTorch, Ray, Dask, Spark, or equivalent)
  • Solid understanding of multimodal data (video, lidar, sensor telemetry) and its challenges in large-scale training
  • Experience defining and tracking data quality metrics, conducting dataset analysis, and driving data-informed improvements in model performance
  • Demonstrated ability to work collaboratively with ML researchers, platform engineers, and product teams in a fast-paced, experimental environment
  • Strong problem-solving skills, a data-driven mindset, and the ability to translate research needs into reliable data solutions
Job Responsibility
  • Design and implement large-scale data acquisition, processing, and curation pipelines, owning the full lifecycle of high-quality datasets used to train advanced robotics and foundation models
  • Continuously improve dataset quality and utility through sophisticated data analysis, debugging, and experimentation, developing metrics, tests, and monitoring mechanisms that directly drive model performance improvements
  • Develop and scale multimodal data pipelines for ingestion, preprocessing, filtering, annotation, and storage across video, LiDAR, and telemetry modalities
  • Run systematic experiments on data ablations and composition to assess their impact on model training dynamics, generalisation, and downstream performance
  • Collaborate with ML researchers and platform engineers to ensure datasets are fit for purpose and efficiently integrated into large-scale training workflows
  • Build internal tools and workflows for dataset auditing, visualization, and versioning to streamline iteration and reproducibility
  • Advance best practices for data governance, reliability, and scalability across the data lifecycle, ensuring data safety, privacy, and long-term maintainability