
ML Platform Engineer

Romania, Bucharest B2B · Job Posted June 15, 2026

Job Description

As an ML Platform / MLOps Engineer, you will design, build, and operate the infrastructure, tooling, and pipelines that make machine learning reliable at scale. You'll sit at the intersection of data engineering, DevOps, and applied ML - owning the platforms and systems that let data scientists and engineers move from experiment to production safely and repeatably. Your work will power intelligent products and internal automation across the company, and will help shape how the organisation adopts ML and AI responsibly.

Job Responsibility

  • Build and maintain MLOps automation end-to-end: CI/CD for models and pipelines, environment management, artifact versioning (models, data, prompts, code), and release governance
  • Implement and operate model serving infrastructure: deployment patterns (blue/green, canary, shadow), endpoint management, scaling, and latency/throughput optimisation
  • Build and maintain training and experimentation infrastructure: job orchestration, compute provisioning, experiment tracking, hyperparameter management, and reproducibility tooling
  • Implement observability for ML systems: data quality checks, feature drift detection, model performance monitoring, bias checks, alerting, and incident response workflows
  • Build and maintain data pipelines for ingestion, transformation, feature engineering, and export across multiple sources and destinations
  • Design and maintain a feature store or feature platform layer: serving consistency, point-in-time correctness, and reuse across teams
  • Expose well-governed datasets, features, and APIs that models, pipelines, and downstream consumers can rely on
  • Enforce secure data handling and compliance with relevant data protection standards, access controls, and audit requirements
  • Contribute to documentation, platform standards, and continuous improvement of ML engineering processes across teams

Requirements

  • Bachelor's degree in Computer Science, Engineering, Mathematics, or a related technical field (or equivalent practical experience)
  • 5+ years of Data or ML Engineering experience, with at least 3 years shipping ML systems to production
  • Strong Python skills (typed code, async, testing) and solid SQL fluency
  • Hands-on MLOps experience: model registries, experiment tracking (MLflow or Vertex Experiments), pipeline orchestration, and reproducible training runs
  • Strong DevOps fundamentals: CI/CD (GitHub Actions, Cloud Build, or similar), IaC (Terraform), containerization (Docker)
  • Familiarity with at least one major cloud provider (GCP, AWS, Azure) and deploying data solutions in the cloud
  • Experience building and maintaining data pipelines with orchestrators (Airflow/Composer, Dagster) and distributed engines (Spark, BigQuery)
  • Strong troubleshooting mindset: ability to debug issues across data, infra, pipelines, and deployments
  • Collaborative mindset and clear communication across engineering, analytics, and business stakeholders

Nice to have

  • Strong GCP experience and ecosystem knowledge: Vertex AI (Model Garden, Pipelines, Endpoints, Experiments, Monitoring), BigQuery, Composer, Dataproc, Cloud Run, Dataplex, Cloud Storage
  • Experience with data governance concepts: access control, retention, data classification, auditability, and compliance standards
  • Model monitoring experience: drift detection, data quality issues, performance degradation, bias checks, and alerting strategies
  • Experience building and maintaining agentic applications or LLM-powered tools using frameworks such as LangGraph, LlamaIndex, or the Anthropic/OpenAI Agents SDKs
  • Familiarity with MCP (Model Context Protocol) or comparable tool/function-calling protocols for LLM integrations

What we offer

  • Vibrant international team operating in a high-tech environment
  • Annual salary reviews, promotions and performance bonuses
  • myPOS Academy for upskilling and training
  • Unlimited access to courses on LinkedIn Learning
  • Annual individual training and development budget
  • Refer-a-friend bonus, because we know that working with friends is fun
  • Teambuilding, social activities and networks on a multi-national level

Similar Jobs for ML Platform Engineer

8 matching positions

ML Platform Engineer

We are looking for a motivated and experienced Machine Learning Platform Enginee...
Location: Israel, Tel-Aviv
Salary: Not provided
Company: K Health
Expiration Date: Until further notice
Requirements
  • 5+ years in software engineering with experience in backend/platform roles
  • 5+ years of experience with Python
  • Proficiency in another language, such as C++, Rust, Java, or Go, is an advantage
  • 2+ years of experience working with cloud platforms such as Google Cloud (preferred), Azure, or AWS, including familiarity with ML workflow frameworks like KFP or Vertex Pipelines
  • Solid experience in ML/AI development (a must)
  • Experience with inference optimization (vLLM) and fine-tuning (Axolotl/Huggingface)
  • Expertise with transformers, PyTorch, CUDA, and other low-level ML libraries
  • Familiarity with Docker and Kubernetes
  • Excellent problem-solving skills and a proactive attitude, with a strong focus on code quality and optimization
  • Collaborative mindset with the ability to work closely with cross-functional teams. Strong communication and teamwork skills are essential
Job Responsibility
  • Design, develop, and maintain our machine learning ecosystem libraries
  • Build and manage data science code, Docker images, and Kubeflow Pipelines (KFP)
  • Create and maintain CI scripts to ensure seamless integration and delivery
  • Conduct thorough code reviews to uphold high-quality standards
  • Collaborate closely with data scientists, understanding and addressing their evolving needs
  • Work alongside software developers to seamlessly integrate machine learning models into production systems
  • Stay current with the latest advancements in machine learning, leveraging innovative techniques to enhance the company’s products and services
What we offer
  • 20 paid vacation days, 18 days sick leave
  • Hybrid work schedule with team meals and stocked fridges
  • Commuter Benefits
  • Community focused events
  • Pension Plan
  • Stipend per day for food
  • Stock options for every full-time employee
  • Vocational Studies Fund
  • Fulltime

ML Platform Engineer

We are seeking a Machine Learning Engineer to help build and scale our machine l...
Location: United States
Salary: Not provided
Company: Duetto
Expiration Date: Until further notice
Requirements
  • 3+ years of experience in ML engineering or a similar role building and deploying machine learning models in production
  • Strong experience with AWS ML services (SageMaker, Lambda, EMR, ECR) for training, serving, and orchestrating model workflows
  • Hands-on experience with Kubernetes (e.g., EKS) for container orchestration and job execution at scale
  • Strong proficiency in Python, with exposure to ML/DL libraries such as TensorFlow, PyTorch, scikit-learn
  • Experience working with feature stores, data pipelines, and model versioning tools (e.g., SageMaker Feature Store, Feast, MLflow)
  • Familiarity with infrastructure-as-code and deployment tools such as Terraform, GitHub Actions, or similar CI/CD systems
  • Experience with logging and monitoring stacks such as Prometheus, Grafana, CloudWatch, or similar
  • Experience working in cross-functional teams with data scientists and DevOps engineers to bring models from research to production
  • Strong communication skills and ability to operate effectively in a fast-paced, ambiguous environment with shifting priorities
Job Responsibility
  • Develop, maintain, and scale machine learning pipelines for training, validation, and batch or real-time inference across thousands of hotel-specific models
  • Build reusable components to support model training, evaluation, deployment, and monitoring within a Kubernetes- and AWS-based environment
  • Partner with data scientists to translate notebooks and prototypes into production-grade, versioned training workflows
  • Implement and maintain feature engineering workflows, integrating with custom feature pipelines and supporting services
  • Collaborate with platform and DevOps teams to manage infrastructure-as-code (Terraform), automate deployment (CI/CD), and ensure reliability and security
  • Integrate model monitoring for performance metrics, drift detection, and alerting (using tools like Prometheus, CloudWatch, or Grafana)
  • Improve retraining, rollback, and model versioning strategies across different deployment contexts
  • Support experimentation infrastructure and A/B testing integrations for ML-based products

Principal ML Engineer, ML Platform Engineering

Xometry is seeking a Principal Machine Learning Engineer to join our core machin...
Location: United States, North Bethesda
Salary: 140000.00 - 182000.00 USD / Year
Company: Cherry Ventures
Expiration Date: Until further notice
Requirements
  • At least 7 years of experience in machine learning engineering, software engineering, data science, or similar technical role
  • A bachelor’s degree is required, but an advanced degree (M.S. or PhD) in computer science, machine learning, AI, or a related field is preferred and may substitute for some years of experience
  • Demonstrated experience designing and deploying cloud infrastructure (AWS preferred) to support machine learning, and machine learning models, with considerations for scale, reliability and security
  • Deep understanding of the machine learning lifecycle and related infrastructure needs - feature stores, A/B testing, model registration, drift detection, automated retraining, etc.
  • Strong technical expertise. You will need to either have or demonstrate the ability to quickly build technical expertise in the following: Software engineering principles, including parallel and distributed computing, version control, reproducibility, and continuous integration
  • Machine learning techniques and algorithms, with emphasis on their impact on infrastructure implementation, including large-scale language and vision models (Transformers, GPT, VLMs, LLMs) and deep learning (PyTorch, TensorFlow)
  • Infrastructure as Code (IaC), especially Terraform
  • REST API design and implementation
  • Object oriented and functional programming in Python
  • Multimodal data processing (e.g., combining text, image, and 3D data)
Job Responsibility
  • Hands-On Technical Leadership: Adopt a 'lead by example' approach by actively coding and troubleshooting, as well as creating documentation and technical diagrams
  • Teaching & Mentorship: You will serve as a mentor and guide to engineers across the organization, teaching and mentoring them to grow their skills
  • Code Review: You will do code review and mentor others within the organization regarding best practices in ML Engineering
  • Operational Excellence: Guarantee the delivery of superior infrastructure and software that not only meets but exceeds customer expectations, while aligning with the strategic business timelines
  • Collaborative Strategy: Forge strong partnerships with product managers, data scientists, and company leadership to promote a culture of open communication and integrated team dynamics
  • Guide Innovation: Champion the adoption of cutting-edge technologies, methodologies, and practices to enhance problem-solving efficiency and effectiveness across the AI/ML organization.
What we offer
  • 401(k) match
  • medical, dental and vision insurance
  • life and disability insurance
  • generous paid time off including vacation, sick leave, floating and fixed holidays, maternity and bonding leave
  • EAP, other wellbeing resources
  • Fulltime

Senior ML Platform Engineer, AI Platform

We are seeking a skilled and passionate ML Platform Engineer to join our team an...
Location: Singapore, Singapore
Salary: Not provided
Company: Airwallex
Expiration Date: Until further notice
Requirements
  • 5+ years in backend software development
  • at least 2 years focused on AI/ML Platform or MLOps infrastructure
  • deep expertise in MLOps practices, including automated deployment pipelines, model optimization, and production lifecycle management
  • proven experience designing and implementing low-latency model serving solutions
  • proficiency in Python
  • skill in writing high-quality, maintainable code
  • experience in design and development of large-scale distributed, high concurrency, low-latency inference, high availability systems
  • excellent communication and mentoring abilities
  • a relevant degree in Computer Science, Mathematics or related fields
Job Responsibility
  • Platform Development: Design, build, and maintain the end-to-end MLOps platform using Kubernetes and Cloud Services
  • Infrastructure as Code (IaC): Use Terraform or similar tools to manage, provision, and scale all ML-related infrastructure securely and efficiently
  • Pipeline Automation: Implement and optimize CI/CD/CT (Continuous Integration, Delivery, Training) pipelines to automate model training, testing, packaging, and deployment using tools like Argo and Kubeflow Pipelines
  • Serving Infrastructure: Build highly available, low-latency, and high-throughput model serving infrastructure
  • Observability: Implement robust monitoring, alerting, and logging solutions to track infrastructure health, model performance, and data/model drift
  • Tooling & Support: Evaluate, integrate, and support ML tools such as Feature Stores and distributed model training pipelines
  • Security & Compliance: Ensure platform security, implement RBAC (Role-Based Access Control), and manage secrets for sensitive data and production environments
  • Collaboration: Work closely with Data Scientists and ML Engineers to understand their needs and provide technical guidance on best practices for scaling their models
  • Fulltime

Full Stack Engineer, ML Platform

Prior Labs is building foundation models that understand tabular data, the backb...
Location: Germany, Berlin
Salary: Not provided
Company: Prior Labs
Expiration Date: Until further notice
Requirements
  • 5+ years of software engineering experience building highly-available user-facing products
  • Experience working across the stack (frontend + backend) and comfort owning features end-to-end
  • Passion for user experience, data science and machine learning
  • Enthusiasm for working in a fast-moving, ambiguous environment where new ideas ship quickly and user impact matters most
  • Strong JavaScript/TypeScript
  • Strong Python
Job Responsibility
  • Own our user-facing frontend & backend applications end-to-end to build the next generation of data science tools
  • Work with product and research to turn new model and agent capabilities into production features
  • Build high-trust, scalable systems
  • Optimize for speed, reliability, and security
  • Fulltime

Senior Software Engineer, ML Platform

We’re looking for a software engineer to join Parafin’s Infrastructure team and ...
Location: United States, San Francisco
Salary: 230000.00 - 265000.00 USD / Year
Company: Parafin
Expiration Date: Until further notice
Requirements
  • 5+ years of software engineering experience, including experience on ML platform/MLOps systems (training, deployment, and/or feature pipelines)
  • Strong Python; solid software design and testing fundamentals
  • Proficiency with SQL; hands-on Spark/PySpark experience
  • Knowledge of ML fundamentals—probability & statistics, supervised vs. unsupervised learning, bias/variance & regularization, feature engineering, model evaluation metrics, validation strategies, and production concerns like drift, stability, and monitoring
  • Expertise with modern data/ML stacks—AWS, Databricks (workflows, lakehouse, MLflow/registry, Model Serving), and Airflow (or equivalent orchestration)
  • Experience building real-time systems (service design, caching, rate limiting, backpressure) and batch pipelines at scale
  • Practical knowledge of feature-store concepts (offline/online stores, backfills, point-in-time correctness), model registries, experiment tracking, and evaluation frameworks
  • Strong problem-solving skills and a proactive attitude toward ownership and platform health
Job Responsibility
  • Turn notebooks into software: Decompose data scientist training/inference notebooks into reusable, tested components (libraries, pipelines, templates) with clear interfaces and documentation
  • Create developer-friendly ML abstractions: Build SDKs, CLIs, and templates that make it simple to define features, train/evaluate models, and deploy to batch or real-time targets with minimal boilerplate
  • Build our real-time ML inference platform: Stand up and scale low-latency model serving
  • Expand batch ML inference: Improve scheduling, parallelism, cost controls, observability, and failure/rollback for large-scale batch scoring and post-processing
  • Own and expand the feature store: Design offline/online feature definitions, high read/write throughput, and consistent offline/online semantics
What we offer
  • Equity grant
  • Medical, dental & vision insurance
  • Work from home flexibility
  • Unlimited PTO
  • Commuter benefits
  • Free lunches
  • Paid parental leave
  • 401(k)
  • Employee assistance program
  • Fulltime

Sr. Staff ML Platform Engineer

Machine learning is the crucial enabler for every financial service that EarnIn ...
Location: United States, Mountain View
Salary: 360000.00 - 440000.00 USD / Year
Company: EarnIn
Expiration Date: Until further notice
Requirements
  • Bachelor's or Master’s degree in Computer Science, Engineering, or a related field
  • 8+ years of industry machine learning experience and excellent software engineering skills
  • Strong programming skills in Python, with familiarity in ML frameworks such as TensorFlow or PyTorch
  • Experience with ML cloud platforms such as AWS SageMaker, Databricks, or GCP Vertex AI
  • Familiarity with data pipelines and workflow management tools
  • Strong communication and collaboration skills
  • Passion for learning and staying updated with the latest industry trends in machine learning and platform engineering
Job Responsibility
  • Design, build, and maintain a robust ML platform and tooling ecosystem that supports the entire machine learning lifecycle, from experimentation to production
  • Lead and mentor a team of ML engineers, deeply understanding their workflows to streamline model training, deployment, and monitoring, while ensuring reproducibility and consistency of results
  • Drive scalability, reliability, and cost efficiency of the ML platform, balancing performance with ease of use for scientists and engineers
  • Evaluate and adopt emerging technologies to continually advance the organization’s machine learning capabilities and maintain a competitive edge
  • Champion operational excellence, setting a high bar for engineering quality, reliability, and automation
  • Act as a catalyst for innovation, spearheading step-change improvements that unlock new opportunities for growth and efficiency
What we offer
  • equity and benefits
  • Fulltime

Senior Platform Engineer, ML Data Systems

We’re looking for an ML Data Engineer to evolve our eval dataset tools to meet t...
Location: United States, Mountain View
Salary: 137871.00 - 172339.00 USD / Year
Company: Khan Academy
Expiration Date: Until further notice
Requirements
  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field
  • 5 years of Software Engineering experience with 3+ of those years working with large ML datasets, especially those in open-source repositories such as Hugging Face
  • Strong programming skills in Go, Python, SQL, and at least one data pipeline framework (e.g., Airflow, Dagster, Prefect)
  • Experience with data versioning tools (e.g., DVC, LakeFS) and cloud storage systems
  • Familiarity with machine learning workflows — from training data preparation to evaluation
  • Familiarity with the architecture and operation of large language models, and a nuanced understanding of their capabilities and limitations
  • Attention to detail and an obsession with data quality and reproducibility
  • Motivated by the Khan Academy mission “to provide a free world-class education for anyone, anywhere.”
  • Proven cross-cultural competency skills demonstrating self-awareness, awareness of others, and the ability to adopt inclusive perspectives, attitudes, and behaviors to drive inclusion and belonging throughout the organization.
Job Responsibility
  • Evolve and maintain pipelines for transforming raw trace data into ML-ready datasets
  • Clean, normalize, and enrich data while preserving semantic meaning and consistency
  • Prepare and format datasets for human labeling, and integrate results into ML datasets
  • Develop and maintain scalable ETL pipelines using Airflow, DBT, Go, and Python running on GCP
  • Implement automated tests and validation to detect data drift or labeling inconsistencies
  • Collaborate with AI engineers, platform developers, and product teams to define data strategies in support of continuously improving the quality of Khan’s AI-based tutoring
  • Contribute to shared tools and documentation for dataset management and AI evaluation
  • Inform our data governance strategies for proper data retention, PII controls/scrubbing, and isolation of particularly sensitive data such as offensive test imagery.
What we offer
  • Competitive salaries
  • Ample paid time off as needed
  • 8 pre-scheduled Wellness Days in 2026 occurring on a Monday or a Friday for a 3-day weekend boost
  • Remote-first culture that caters to your time zone, with flexibility as needed
  • Generous parental leave
  • An exceptional team that trusts you and gives you the freedom to do your best
  • The chance to put your talents towards a deeply meaningful mission and the opportunity to work on high-impact products that are already defining the future of education
  • Opportunities to connect through affinity, ally, and social groups
  • 401(k) + 4% matching & comprehensive insurance, including medical, dental, vision, and life.
  • Fulltime