We’re looking for a highly skilled Machine Learning Engineer to join our data and analytics team. You’ll design, build, and optimise scalable, high-performance data pipelines and ML workflows on Databricks, leveraging PySpark, Unity Catalog, and modern data lakehouse technologies such as Delta Lake and Apache Iceberg. This is an exciting opportunity to shape our ML infrastructure and production pipelines, working at the intersection of data engineering and applied machine learning. We supply our customers with an array of predictive data science models; for every customer, we train and execute thousands of models that are uniquely configured to provide tailored results.
Job Responsibilities:
Architect and implement high-throughput data pipelines using PySpark on the Databricks platform to support large-scale ML workloads
Develop and operationalise ML workflows and model pipelines leveraging industry-leading frameworks and MLOps practices
Drive data governance and metadata management via feature stores to ensure data quality, lineage and compliance
Work with modern data lakehouse storage formats such as Delta Lake and Apache Iceberg to deliver reliable, performant data architectures
Collaborate across cross-functional teams including data engineering, data science, analytics and product to translate business challenges into scalable ML solutions
Monitor, tune and optimise the performance, cost-efficiency and reliability of data/ML pipelines in a production environment
Stay abreast of emerging technologies and best practices in data engineering, ML infrastructure and pipeline design to continuously improve our platform
Requirements:
A strong track record (typically 5+ years) in data engineering or ML engineering, with significant experience in building large-scale data pipelines and production ML systems
Proficiency in Python, and strong hands-on experience with PySpark in distributed, big-data environments
Proven experience with Databricks (cluster config, job orchestration, runtime optimisation)
Deep understanding and practical use of Unity Catalog, Delta Lake and Apache Iceberg in data lakehouse architectures
Expertise in data modelling, partitioning strategies, performance tuning and best practices for high-throughput data systems
Familiarity with ML frameworks (e.g., TensorFlow, PyTorch, Scikit-learn) and experience deploying models into production
Experience with CI/CD pipelines for data and ML code
Excellent communication skills and ability to work collaboratively in agile team environments
Proven experience owning complex technical problems, guiding and influencing cross-team solutions and lifting platform standards
Comfortable driving infrastructure or platform-level initiatives, making architectural decisions that align with business outcomes
A proactive mindset with the ability to identify and evangelise tools, practices and improvements that elevate the team’s capability
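As an illustration of the model-deployment experience listed above, the sketch below trains a scikit-learn model and round-trips it through a serialised artifact. The dataset and buffer target are hypothetical stand-ins; in a real pipeline the artifact would go to a model registry (e.g., MLflow) rather than an in-memory buffer.

```python
import io

import joblib
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy training data standing in for real customer features (illustrative only).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X, y)

# Serialise the trained model as a deployable artifact and restore it,
# verifying the restored copy predicts identically to the original.
buf = io.BytesIO()
joblib.dump(model, buf)
buf.seek(0)
restored = joblib.load(buf)

predictions_match = (restored.predict(X) == model.predict(X)).all()
```

The serialise-restore check is the core of any deploy step: whatever registry or serving layer sits downstream, the artifact must reproduce the trained model’s behaviour exactly.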
Nice to have:
Experience with cloud platforms such as AWS, Azure or GCP
Exposure to streaming architectures (e.g., Structured Streaming, Kafka) or real-time ML
Background with feature stores, data governance frameworks or data quality tooling
Experience using Databricks or similar data platforms