We are seeking a Senior CVML Platform Engineer to help design, build, and evolve the platforms that support computer vision and ML workloads at scale. This role focuses on enabling ML teams through well-designed infrastructure, tooling, and workflows, rather than developing models or conducting ML research. The ideal candidate brings strong technical judgment, is comfortable navigating existing and evolving platforms, and can incrementally improve systems while maintaining reliability. We strongly prefer engineers with a DevOps or platform engineering background who have moved into ML-adjacent systems and are motivated by building durable foundations that other teams rely on. This role requires both hands-on engineering and the ability to influence platform direction through collaboration and thoughtful design.
Job Responsibilities:
Design, build, and evolve platform capabilities that support ML training, batch inference, and model deployment workflows at scale
Own and improve core platform components (e.g., compute orchestration, data pipelines, inference systems) used by multiple teams across Blue River and John Deere
Continuously enhance platform reliability, scalability, and performance, with a focus on real-world ML workloads
Enable ML engineers to move faster by building intuitive, well-documented platform tools and workflows across the model lifecycle (experimentation, deployment, and iteration)
Improve model inference performance and throughput while balancing trade-offs among cost, latency, and reliability
Support and scale distributed training and inference systems, including frameworks such as Ray and related tooling
Develop and optimize hybrid compute environments (cloud + on-prem/GPU infrastructure) to support large-scale ML workloads
Build and maintain infrastructure leveraging Kubernetes, Slurm, and cloud platforms (AWS preferred)
Identify and resolve bottlenecks in compute, storage, and data movement pipelines
Evaluate existing platform systems and make thoughtful decisions on when to extend, refactor, or rebuild components
Drive improvements in system architecture, balancing short-term delivery with long-term platform health
Contribute to shaping the platform roadmap and technical direction in response to evolving business and ML needs
Partner closely with ML engineers, robotics teams, infrastructure teams, and product stakeholders to translate requirements into scalable platform solutions
Act as a technical bridge between teams, ensuring platform capabilities align with real-world use cases and constraints
Influence platform adoption and best practices across multiple teams
Support platform capabilities that enable simulation-based testing and validation of ML systems, including synthetic data workflows
Improve tooling that allows teams to test and validate models before production deployment
Provide technical guidance and mentorship to junior engineers on platform and systems design
Lead implementation efforts for key platform initiatives and ensure high-quality execution
Demonstrate strong ownership and accountability for delivering impactful platform improvements
Requirements:
5+ years of professional engineering experience, with a focus on platform, infrastructure, or systems engineering
Strong technical judgment in balancing the evolution of legacy platforms with the design and delivery of new, greenfield components shared across multiple teams and workloads
Excellent Python skills, used in production systems, tooling, and platform components
Solid understanding of ML systems and the end-to-end model development lifecycle, from experimentation to deployment and iteration
Hands-on experience or strong familiarity with cloud platforms (AWS preferred), container orchestration systems such as Kubernetes, and workload managers such as Slurm
Ability to partner effectively with ML engineers, infra teams, and product stakeholders to translate requirements into platform capabilities
Ability to quickly ramp up on new domains, tools, and complex existing systems
Nice to have:
Golang experience, particularly for platform or infrastructure components
Experience building or integrating ML pipelines using tools such as Kubeflow and/or Airflow
Understanding of model inference architectures, including performance, scalability, reliability, and cost considerations
Experience enabling distributed training and inference through platforms and frameworks such as Ray
Experience supporting ML systems in computer vision or robotics environments