Start.io, a leading mobile marketing and audience platform, empowers the app ecosystem with solutions for mobile marketing, audience building, and monetization. Integrated into over 500,000 monthly active apps worldwide, Start.io leverages first-party data to deliver impactful, scalable advertising solutions.
We're looking for a highly skilled, independent, and driven Machine Learning Engineer to lead the design and development of our next-generation real-time inference services - the core engine powering Start.io's algorithmic decision-making at scale. This is a rare opportunity to own the system at the heart of our product, serving billions of daily requests across mobile apps under tight latency and performance constraints. You'll work at the intersection of machine learning, large-scale backend engineering, and business logic, building robust services that blend predictive models with dynamic engineering logic - all while meeting extreme performance and reliability requirements.
Job Responsibilities:
Own and lead the design and development of low-latency Algo inference services handling billions of requests per day
Build and scale robust real-time decision-making engines, integrating ML models with business logic under strict SLAs
Collaborate closely with Data Science to deploy models seamlessly and reliably in production
Design systems for model versioning, shadowing, and A/B testing at runtime
Ensure high availability, scalability, and observability of production systems
Continuously optimize latency, throughput, and cost-efficiency using modern tooling and techniques
Work independently while interfacing with cross-functional stakeholders from Algo, Infra, Product, Engineering, BA & Business
Requirements:
B.Sc. or M.Sc. in Computer Science, Software Engineering, or a related technical discipline
5+ years of experience building high-performance backend or ML inference systems
Deep expertise in Python and experience with low-latency APIs and real-time serving frameworks (e.g., FastAPI, Triton Inference Server, TorchServe, BentoML)
Experience with scalable service architecture, message queues (Kafka, Pub/Sub), and async processing
Strong understanding of model deployment practices, online/offline feature parity, and real-time monitoring
Experience in cloud environments (AWS, GCP, or OCI) and container orchestration (Kubernetes)
Experience working with in-memory and NoSQL databases (e.g., Aerospike, Redis, Bigtable) to support ultra-fast data access in production-grade ML services
Familiarity with observability stacks (Prometheus, Grafana, OpenTelemetry) and best practices for alerting and diagnostics
A strong sense of ownership and the ability to drive solutions end-to-end
Passion for performance, clean architecture, and impactful systems
What we offer:
Lead the mission-critical inference engine that drives our core product
Join a high-caliber Algo group solving real-time, large-scale, high-stakes problems
Work on systems where every millisecond matters, and every decision drives real value
Enjoy a fast-paced, collaborative, and empowered culture with full ownership of your domain