Machine Learning Platform / Backend Engineer

Everseen

Location:
Belgrade, Serbia; Romania

Category:
IT - Software Development

Contract Type:
Not provided

Salary:
Not provided

Job Description:

We are seeking a Machine Learning Platform/Backend Engineer to design, build, and maintain scalable infrastructure that empowers our data scientists and machine learning engineers to develop, train, benchmark, and monitor machine learning models efficiently. You will be instrumental in shaping our internal Machine Learning Platform and driving automation, reproducibility, and performance across the machine learning lifecycle.

Job Responsibility:

  • Design, build, and maintain scalable infrastructure that empowers data scientists and machine learning engineers
  • Own the design and implementation of the internal ML platform, enabling end-to-end workflow orchestration, resource management, and automation using cloud-native technologies (GCP/Azure)
  • Design and manage Kubernetes-based infrastructure for multi-tenant GPU and CPU workloads with strong isolation, quota control, and monitoring
  • Integrate and extend orchestration tools (Airflow, Kubeflow, Ray, Vertex AI, Azure ML or custom schedulers) to automate data processing, training, and deployment pipelines
  • Develop shared services for model behavior/performance tracking, data/datasets versioning, and artifact management (MLflow, DVC, or custom registries)
  • Build out documentation covering architecture, policies, and operations runbooks
  • Share skills, knowledge, and expertise with members of the data engineering team
  • Foster a culture of collaboration and continuous learning by organizing training sessions, workshops, and knowledge-sharing sessions
  • Collaborate and drive progress with cross-functional teams to design and develop new features and functionalities
  • Ensure that the developed solutions meet project objectives and enhance user experience
  • Have influence over the technology stack and internal technical improvements, contributing to strategic decision-making
  • Based on requirements and a longer-term product and feature strategy, design and implement reusable, testable, efficient, and elegant code
  • Ensure adherence to coding standards and best practices
  • Create, maintain, and run unit tests for new and existing applications and services
  • Aim to deliver defect-free and well-tested solutions
  • Analyze and collect data from various sources such as log files, application stack traces, and thread dumps
  • Utilize data analysis to identify trends, patterns, and potential areas for improvement
  • Implement changes based on data analysis
  • Create and maintain CI/CD integration using various tools
  • Automate the build, test, and deployment processes to ensure efficiency and reliability
  • Research and propose third-party software solutions to optimize system performance
  • Expand product capabilities by integrating compatible third-party solutions
  • Monitor updates to third-party solutions and track their compatibility with the Everseen stack according to internal development guidelines
  • Monitor production logs to identify and troubleshoot issues promptly
  • Ensure seamless operation and timely resolution of any anomalies to maintain system reliability
  • Create, review, and maintain high-quality technical documentation to ensure clarity, consistency, and knowledge sharing within the development team

Requirements:

  • 4-5+ years of work experience in ML infrastructure, MLOps, or platform engineering
  • Bachelor's degree or equivalent in the computer science field is preferred
  • Excellent communication and collaboration skills
  • Expert knowledge of Python
  • Experience with CI/CD tools (e.g., GitLab, Jenkins)
  • Hands-on experience with Kubernetes, Docker, and cloud services
  • Understanding of ML training pipelines, data lifecycle, and model serving concepts
  • Familiarity with workflow orchestration tools (e.g., Airflow, Kubeflow, Ray, Vertex AI, Azure ML)
  • A demonstrated understanding of the ML lifecycle, model versioning, and monitoring
  • Experience with ML frameworks (e.g., TensorFlow, PyTorch)
  • Experience with GPU orchestration (e.g., NVIDIA GPU Operator, MIG)
  • Experience with Infrastructure as Code (e.g., Terraform)
  • Experience with Data engineering tools (e.g., Snowflake, Databricks, BigQuery, Airbyte, Kafka)
  • Familiarity with feature stores and model registries
  • Exposure to large-scale distributed systems and performance optimisation
  • Ability to work with Linux systems, including troubleshooting skills such as log investigations, performance testing, and connectivity investigation
  • Deep understanding of technical concepts and terminology relevant to Everseen's products and services
  • Expert knowledge of advanced concepts like microservices and distributed systems
  • In-depth knowledge of Azure Kubernetes Service for container orchestration, Azure Blob Storage for data storage, and Elasticsearch for search and analytics
  • Ability to leverage cloud computing technologies and services for testing and validation purposes
  • In-depth knowledge of cloud security, scalability, and performance optimization principles
  • Excellent understanding of cloud computing technologies and services, including infrastructure as a service (IaaS), platform as a service (PaaS), and software as a service (SaaS)
  • Broad understanding of the software engineering and architecture space, including knowledge of various programming languages, frameworks, techniques, and industry trends in AI

Nice to have:

  • Interest in Learning and Growth Mindset
  • Demonstrated interest in learning and a strong desire to expand knowledge in their respective field
  • Curiosity to explore new technologies, methodologies, and best practices to enhance skills and capabilities
  • Results-oriented attitude, with a drive to achieve objectives efficiently
  • Analytical and Problem-Solving Skills
  • Strong analytical and problem-solving abilities, leveraging data to inform product decisions

Additional Information:

Job Posted:
December 08, 2025

Employment Type:
Full-time
Work Type:
Hybrid work