We are looking for a Machine Learning Engineer to lead the design, development, and operation of production-grade machine learning infrastructure at scale. In this role, you will architect robust pipelines, deploy and monitor ML models, and ensure reliability, reproducibility, and governance across our AI/ML ecosystem. You will work at the intersection of ML, DevOps, and cloud systems, enabling our teams to accelerate experimentation while ensuring secure, efficient, and compliant deployments.
Job Responsibilities:
Architect, design, and lead the implementation of the entire ML lifecycle
Develop and maintain highly automated, resilient systems for continuous training, testing, deployment, monitoring, and rollback of machine learning models
Establish and enforce state-of-the-art practices for model versioning, reproducibility, auditing, lineage tracking, and compliance
Develop comprehensive, real-time monitoring, alerting, and logging solutions
Act as the primary driver for efficiency, pioneering best practices in Infrastructure-as-Code (IaC), sophisticated container orchestration, and continuous delivery (CD)
Partner closely with Security Teams and Product Engineering to define requirements and deliver robust, secure, and production-ready AI systems
Continuously evaluate, prototype, and introduce cutting-edge tools, frameworks, and practices
Strategically manage and optimize ML infrastructure resources
Requirements:
4+ years of software/DevOps/ML engineering experience
At least 3 years focused specifically on advanced MLOps, ML Platform, or production ML infrastructure
5+ years of experience building ML models
Deep expertise in building scalable, production-grade systems using strong programming skills (Python, Go, or Java)
Expertise in leveraging cloud platforms (AWS, GCP, Azure) and container orchestration (Kubernetes, Docker) for ML workloads
Proven hands-on experience in the ML Infrastructure lifecycle
Mandatory experience with advanced inference techniques
Strong, hands-on experience with comprehensive CI/CD pipelines, infrastructure-as-code (Terraform, Helm), and robust monitoring/observability solutions (Prometheus, Grafana, ELK/EFK stack)
Comprehensive knowledge of data pipelines, feature stores, and high-throughput streaming systems (Kafka, Spark, Flink)
Expertise in operationalizing ML models
A strong track record of influencing cross-functional stakeholders
MS/PhD in Computer Science, Data Science, or Engineering