Senior / Lead Machine Learning Engineer, Serving

Germany · Job Posted April 23, 2026

Job Description

Inworld is a product-oriented research lab of top AI researchers and engineers, developing best-in-class realtime multimodal models and the only realtime orchestration platform optimized for thousands of queries per second. We’ve raised more than $125M from Lightspeed, Section 32, Kleiner Perkins, Microsoft’s M12 venture fund, Founders Fund, Meta and Stanford, among others. Our technology has powered experiences from companies such as NVIDIA, Microsoft Xbox, Niantic, Logitech Streamlabs, Wishroll, Little Umbrella and Bible Chat. We’ve also been recognized by CB Insights as one of the 100 most promising AI companies globally and have been named one of LinkedIn's Top 10 Startups in the USA.

Requirements

  • Inference Optimization. Deep understanding of modern serving techniques and frameworks such as vLLM or TRT-LLM
  • Model Acceleration. Hands-on experience with quantization, distillation, caching strategies, continuous batching, paged attention, and speculative decoding
  • High-Performance Systems. Proficiency in C++, CUDA, Rust, or highly optimized Python. You know how to profile code and squeeze every ounce of performance out of NVIDIA GPUs
  • Distributed Systems & Scaling. Experience with Kubernetes, Ray, custom load balancing, multi-GPU/multi-node inference, and reliably handling thousands of concurrent connections
  • Public work. Non-trivial systems programming projects, open-source contributions to major inference engines, or deep-dive technical write-ups
  • Full-cycle ownership. You can take a model from the research team, containerize it, optimize its serving, and ensure it runs reliably in production
  • Background. PhD in CS, Physics, Math, or equivalent practical experience building backend or ML systems
  • Professional fluency in English (written and spoken) is required, as you will be collaborating daily with our US-based leadership and engineering teams

Similar Jobs for Senior / Lead Machine Learning Engineer, Serving

8 matching positions

Senior / Lead Machine Learning Engineer, Serving

Location: Serbia
Salary: Not provided
Company: Inworld AI
Expiration Date: Until further notice
Requirements
  • Inference Optimization
  • Model Acceleration
  • High-Performance Systems
  • Distributed Systems & Scaling
  • Public work
  • Full-cycle ownership
  • Background
  • Professional fluency in English

Senior Machine Learning Engineer - Maps

The Places Data Team owns Uber's "Ground Truth" — the definitive dataset of POIs...
Location: Netherlands, Amsterdam
Salary: Not provided
Company: Uber
Expiration Date: Until further notice
Requirements
  • Ph.D., M.S. or Bachelor's degree in Computer Science, Machine Learning, or Operations Research, or equivalent technical background with exceptional demonstrated impact
  • 4+ years of experience in developing and deploying machine learning models and optimization algorithms in large-scale production environments, delivering measurable business impact over multiple quarters and making significant technical contributions
  • Proficiency in programming languages such as Python, Scala, Java, or Go
  • Experience with large-scale data systems (e.g. Spark, Ray), real-time processing (e.g. Flink), and microservices architectures
  • Experience in the development, training, productionization and monitoring of ML solutions at scale, ranging from offline pipelines to online serving and MLOps
Job Responsibility
  • Design, develop and productionize end-to-end ML solutions for places data conflation (POI, addresses, BFP, etc.) and attribute inference using a mix of classical ML, deep learning, and generative AI
  • Collaborate with product, science, and engineering teams to execute on the technical vision and roadmap
  • Conduct rigorous experimentation and A/B testing to validate model performance and iterate on improvements
  • Own projects from initial mathematical formulation through to prototyping, algorithm implementation, and large-scale experimentation in production
  • Raise the technical bar for the team. You will mentor L3/L4 engineers, lead complex code reviews, and foster a culture of engineering excellence and scientific rigor

Senior Machine Learning Engineer

AI has created an unprecedented opportunity to make work better for hundreds of ...
Location: United States, Redmond
Salary: 119800.00 - 234700.00 USD / Year
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python with experience in at least one deep learning framework such as PyTorch, JAX, or TensorFlow
  • OR equivalent experience
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
  • These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position requires passing the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Job Responsibility
  • Lead the design and architecture of ML solutions for projects/sub-systems
  • Select appropriate models, training regimes, and serving approaches
  • Produce maintainable, efficient, and explainable ML code
  • Drive monitoring for model drift, bias/fairness, and reliability
  • Mentor early-in-profession engineers, provide design/code reviews, and raise quality standards
  • Partner with other teams to ensure integrated ML systems are production-ready
Employment Type: Full-time

Senior Machine Learning Engineer, Computer Vision - Robotics

Scale’s Robotics business unit is dedicated to solving the data bottleneck in Ph...
Location: United States, San Francisco
Salary: 218400.00 - 273000.00 USD / Year
Company: Scale
Expiration Date: Until further notice
Requirements
  • Ph.D. in Computer Science, Computer Engineering, or a related quantitative field (Mathematics, Electrical Engineering, etc.) OR a Master’s degree with 4+ years of equivalent professional experience in an applied research setting
  • 5+ years of hands-on experience in algorithm development for 2D/3D computer vision and deep learning
  • Expert proficiency in at least one major deep learning framework (PyTorch, TensorFlow, or JAX)
  • Mastery of Python for machine learning and strong proficiency in C++ for performance-critical algorithm implementation
  • In-depth knowledge of classical and modern computer vision fundamentals, including multi-view geometry, projective geometry, camera calibration, and 3D graphics/rendering principles
  • Experience building real-time and batch ML systems that analyze structured and unstructured signals
  • Hands-on experience rapidly prototyping and iterating on ML systems with changing requirements
Job Responsibility
  • Pioneer Core CV Algorithms: Lead the research, design, and implementation of novel computer vision and deep learning algorithms, with a specialized focus on 2D and 3D data (e.g., point clouds)
  • Focus Area Expertise: Drive innovation in key perception areas, including:
      • 3D Reconstruction and SLAM: Advanced techniques for real-time 3D mapping, pose estimation, and environmental modeling from multi-modal sensor inputs (e.g., RGB-D, LiDAR)
      • Hand/Body Tracking: Developing robust and precise models for hand pose estimation, gesture recognition, and full-body tracking under various lighting and occlusion conditions
      • Object Detection and Tracking (MOT/SOT): Designing high-performance deep learning models for accurate detection and persistent tracking of objects and people in video streams
      • Video Processing: Creating algorithms for temporal feature extraction, video-based action recognition, and motion analysis
  • Model Optimization: Optimize computationally intensive models for deployment on edge devices (low power, low latency) and/or large-scale cloud infrastructure
  • Technical Leadership: Serve as the subject matter expert in Computer Vision, providing technical direction and mentorship to junior engineers and cross-functional teams
  • Publication & IP: Maintain state-of-the-art knowledge, evaluate recent academic publications (e.g., CVPR, ICCV, ECCV), and drive the filing of patents and publication of novel research
  • Cross-Functional Partnering: Collaborate closely with Software Engineering, Product, and Hardware teams to define requirements, integrate vision systems, and ensure solutions meet performance targets
What we offer
  • Comprehensive health, dental, and vision coverage
  • Retirement benefits
  • A learning and development stipend
  • Generous PTO
  • Equity-based compensation
  • Possible eligibility for additional benefits such as a commuter stipend
Employment Type: Full-time

Senior Machine Learning Engineer

As a Senior Machine Learning Engineer, you will take end-to-end ownership of the...
Location: Canada
Salary: 128000.00 - 160000.00 CAD / Year
Company: FreshBooks
Expiration Date: Until further notice
Requirements
  • 5+ years of experience in data science, applied ML, or ML engineering roles
  • Strong background in supervised and unsupervised learning, statistical modeling, and experimentation techniques
  • Proven experience developing and shipping ML models in production environments (batch or real-time)
  • Strong Python and SQL skills
  • Comfort working with structured and unstructured data
  • Hands-on experience building and deploying ML or LLM-based systems (e.g. retrieval-augmented generation, embeddings, prompt tuning)
  • Familiarity with cloud infrastructure and ML tools, ideally on Google Cloud Platform (e.g. Vertex AI, BigQuery, Cloud Composer, Kubernetes)
  • Experience working with CI/CD pipelines, containerization (Docker), and job orchestration tools (Airflow, dbt, etc.)
  • Deep understanding of end-to-end ML operations including model observability, model drift detection, and model performance optimization
  • Strong communication skills and ability to explain technical concepts to non-technical stakeholders
Job Responsibility
  • Design, prototype, and validate machine learning models to power product features or internal tools
  • Own and lead all phases of the ML lifecycle from experimentation through to production deployment and model monitoring
  • Collaborate with Data Engineers and Product Engineers to integrate models into production infrastructure (batch and online serving)
  • Develop and prototype features for the shared feature store, including documentation, versioning, and consistency validation
  • Author high-quality, production-ready code with appropriate tests, observability, and monitoring hooks
  • Design experiments (e.g. A/B tests, pre-post analyses) and interpret results to guide product and business decisions
  • Design and build end-to-end pipelines for classification, ranking, embeddings, or generation tasks
  • Drive reliability practices in deployed models, including retraining logic, alerting on drift, and root cause analysis
  • Work closely with product and engineering stakeholders to align ML work with business priorities
  • Contribute to standards and documentation, mentor junior team members, and help shape our evolving ML platform
Employment Type: Full-time

Senior Machine Learning Engineer

We are seeking a Senior Machine Learning Engineer to bridge the gap between adva...
Location: Switzerland, Zürich
Salary: Not provided
Company: Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Master’s or PhD in Computer Science, Artificial Intelligence, or High-Performance Computing
  • 4+ years of experience in Machine Learning, with a mandatory split focus between Model Architecture and Systems Optimization
  • Proven experience building and shipping Vision-Language Models (e.g., architectures similar to CLIP, Flamingo, Pix2Struct)
  • Must have experience creating custom evaluation sets for tasks like Document Understanding
  • Expert-level knowledge of SGLang and vLLM for optimized serving
  • Demonstrable experience optimizing models for both NVIDIA (H100) and AMD (MI300x) accelerators
  • Hands-on experience with Knowledge Distillation and Pruning to reduce model latency for target serving sizes
  • A track record of taking complex multi-modal models from research code to a deployed, user-facing production product
Job Responsibility
  • Continuously evaluate and implement the latest research trends in Vision-Language Models, specifically focusing on Referring Expression Comprehension (REC), Document Understanding (Pix2Struct), and Visual Question Answering (VQA)
  • Design and build massive-scale training and evaluation datasets, ensuring multilingual compatibility and broad visual understanding for European market requirements
  • Lead the model co-design process, creating architectures that are natively optimized for accelerator capabilities (compute-bound vs. memory-bound operations)
  • Architect high-throughput serving layers using SGLang and vLLM, optimizing for non-standard decoding strategies
  • Implement scientific experiments to find the Pareto-optimal frontier between serving latency and generation quality
  • Execute Knowledge Distillation (KD), unstructured pruning, and quantization techniques to fit large-scale VLM architectures onto single-node GPU setups (specifically H100 or MI300x) without compromising model quality
  • Write and optimize custom kernels (CUDA/HIP) to reduce serving latency, identifying bottlenecks at the operator level
  • Manage the full pre-training and post-training tech stack, ensuring seamless integration between model weights and inference engines
  • Take ownership of landing the serving-efficient model in a production environment, ensuring reliability and scalability
Employment Type: Full-time

Senior Machine Learning Engineer

As a Senior Machine Learning Engineer at Arrive, you will serve as a key technic...
Location: United States, Austin
Salary: Not provided
Company: Arrive Logistics
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Engineering, or a related field or equivalent professional experience
  • 5+ years of experience with MLOps, model serving, and optimization
  • 5+ years of experience with Python, object-oriented programming, and building highly scalable backend services
  • Expertise in libraries like scikit-learn, pandas, and NumPy
  • 3+ years of experience with relational databases
  • 2+ years in a lead or senior-level capacity
  • 2+ years of experience designing maintainable and scalable systems
  • Proven expertise in system design with a focus on distributed systems and event-driven architectures
  • Experience developing cloud-native dockerized applications in Kubernetes
  • Experience working with online experimentation and platforms like Statsig
Job Responsibility
  • Design, build, and maintain scalable ML systems and infrastructure using Python, Postgres, and Elasticsearch
  • Lead sprints, conduct rigorous code reviews, and set the “gold standard” for ML engineering practices across the organization
  • Actively mentor junior and mid-level engineers, fostering a culture of technical excellence and professional growth
  • Partner closely with other Machine Learning Engineers, Product Managers, Data Scientists, Data Engineers, and Product Engineers to ensure the successful delivery of strategic and roadmap initiatives
  • Own systems throughout the software development lifecycle, from design to development, deployment and monitoring
  • Maintain and improve performance of existing data systems and processes while balancing maintainability, observability and readability
  • Participate in an on-call rotation where you will support incidents and questions about service behavior from product managers
  • Develop a thorough understanding of a domain and explain the behavior of and contribute to code bases that may be outside your domain
  • Proactively propose solutions to gaps or risks in process, technology, software design and architecture
  • Provide rigorous and detailed code reviews that uphold team standards, testing and software design best practices
What we offer
  • Comprehensive benefits package, including medical, dental, vision, life, disability, and supplemental coverage
  • Matching 401(k) program
  • Employee Resource Groups
  • Office-wide engagement activities, team events, happy hours, and more
  • Casual dress code
  • Convenient location close to the airport and downtown
  • Free on-site parking
  • Fully stocked coffee bar, Broker’s Brew
  • Onsite gym
  • Free counseling sessions through Employee Assistance Program
Employment Type: Full-time

Senior Machine Learning Engineer

Start.io, a leading mobile marketing and audience platform, empowers the app eco...
Location: Not provided
Salary: Not provided
Company: Start.io
Expiration Date: Until further notice
Requirements
  • B.Sc. or M.Sc. in Computer Science, Software Engineering, or a related technical discipline
  • 5+ years of experience building high-performance backend or ML inference systems
  • Deep expertise in Python and experience with low-latency APIs and real-time serving frameworks (e.g., FastAPI, Triton Inference Server, TorchServe, BentoML)
  • Experience with scalable service architecture, message queues (Kafka, Pub/Sub), and async processing
  • Strong understanding of model deployment practices, online/offline feature parity, and real-time monitoring
  • Experience in cloud environments (AWS, GCP, or OCI) and container orchestration (Kubernetes)
  • Experience working with in-memory and NoSQL databases (e.g. Aerospike, Redis, Bigtable) to support ultra-fast data access in production-grade ML services
  • Familiarity with observability stacks (Prometheus, Grafana, OpenTelemetry) and best practices for alerting and diagnostics
  • A strong sense of ownership and the ability to drive solutions end-to-end
  • Passion for performance, clean architecture, and impactful systems
Job Responsibility
  • Own and lead the design and development of low-latency Algo inference services handling billions of requests per day
  • Build and scale robust real-time decision-making engines, integrating ML models with business logic under strict SLAs
  • Collaborate closely with DS to deploy models seamlessly and reliably in production
  • Design systems for model versioning, shadowing, and A/B testing at runtime
  • Ensure high availability, scalability, and observability of production systems
  • Continuously optimize latency, throughput, and cost-efficiency using modern tooling and techniques
  • Work independently while interfacing with cross-functional stakeholders from Algo, Infra, Product, Engineering, BA & Business
What we offer
  • Lead the mission-critical inference engine that drives our core product
  • Join a high-caliber Algo group solving real-time, large-scale, high-stakes problems
  • Work on systems where every millisecond matters, and every decision drives real value
  • Enjoy a fast-paced, collaborative, and empowered culture with full ownership of your domain