ML Engineer - Inference Serving Job at Luma AI (Palo Alto)

ML Engineer (Production-focused)

We are looking for an ML Engineer with hands-on experience bringing models into ...

Location

France, Paris

Salary:

Not provided

Corsearch

Expiration Date

Until further notice

Requirements

3+ years of experience as an ML Engineer delivering models into production
Strong programming skills in Python (production-level)
Hands-on experience with PyTorch (preferred) or TensorFlow
Proven experience in model deployment / model serving
Experience optimizing inference (latency, resource usage, throughput)
Strong understanding of ML pipelines and automated workflows
Experience with Docker and containerized ML workloads
Ability to demonstrate measurable impact (e.g., uplift, precision improvements, latency reduction, stability gains)
Fluent spoken and written English
Located within a time zone aligned with CET (CET −2 to CET +4)

Job Responsibility

Build and maintain ML models for large-scale detection, classification, and automation tasks
Optimize inference performance (latency, throughput, memory)
Develop and maintain end-to-end ML pipelines: data processing, training, validation, deployment, monitoring
Integrate ML components into microservice-based architecture
Work closely with engineering teams to ensure reliability and performance in production
Improve tooling for model versioning, testing, and CI/CD

Senior Software Engineer - Network Enablement (Applied ML)

We build simple yet innovative consumer products and developer APIs that shape h...

Location

United States, San Francisco

Salary:

180000.00 - 270000.00 USD / Year

Plaid

Expiration Date

Until further notice

Requirements

Strong software engineering skills including systems design, APIs, and building reliable backend services (Go or Python preferred)
Production experience with batch and streaming data pipelines and orchestration tools such as Airflow or Spark
Experience building or operating real-time scoring and online feature-serving systems, including feature stores and low-latency model inference
Experience integrating model outputs into product flows (APIs, feature flags) and measuring impact through experiments and product metrics
Experience with model lifecycle and operations: model registries, CI/CD for models, reproducible training, offline & online parity, monitoring and incident response

Job Responsibility

Embed model inference into Network Enablement product flows and decision logic (APIs, feature flags, backend flows)
Define and instrument product + ML success metrics (fraud reduction, retention lift, false positives, downstream impact)
Design and run experiments and rollout plans (backtesting, shadow scoring, A/B tests, feature-flagged releases) to validate product hypotheses
Build and operate offline training pipelines and production batch scoring for bank intelligence products
Ship and maintain online feature serving and low-latency model inference endpoints for real-time partner/bank scoring
Implement model CI/CD, model/version registry, and safe rollout/rollback strategies
Monitor model/data health: drift/regression detection, model-quality dashboards, alerts, and SLOs targeted to partner product needs
Ensure offline and online parity, data lineage, and automated validation / data contracts to reduce regressions
Optimize inference performance and cost for real-time scoring (batching, caching, runtime selection)
Ensure fairness, explainability and PII-aware handling for partner-facing ML features

What we offer

medical
dental
vision
401(k)
equity
commission

Full-time

ML Platform Engineer

We are seeking a Machine Learning Engineer to help build and scale our machine l...

Location

United States

Salary:

Not provided

Duetto

Expiration Date

Until further notice

Requirements

3+ years of experience in ML engineering or a similar role building and deploying machine learning models in production
Strong experience with AWS ML services (SageMaker, Lambda, EMR, ECR) for training, serving, and orchestrating model workflows
Hands-on experience with Kubernetes (e.g., EKS) for container orchestration and job execution at scale
Strong proficiency in Python, with exposure to ML/DL libraries such as TensorFlow, PyTorch, scikit-learn
Experience working with feature stores, data pipelines, and model versioning tools (e.g., SageMaker Feature Store, Feast, MLflow)
Familiarity with infrastructure-as-code and deployment tools such as Terraform, GitHub Actions, or similar CI/CD systems
Experience with logging and monitoring stacks such as Prometheus, Grafana, CloudWatch, or similar
Experience working in cross-functional teams with data scientists and DevOps engineers to bring models from research to production
Strong communication skills and ability to operate effectively in a fast-paced, ambiguous environment with shifting priorities

Job Responsibility

Develop, maintain, and scale machine learning pipelines for training, validation, and batch or real-time inference across thousands of hotel-specific models
Build reusable components to support model training, evaluation, deployment, and monitoring within a Kubernetes- and AWS-based environment
Partner with data scientists to translate notebooks and prototypes into production-grade, versioned training workflows
Implement and maintain feature engineering workflows, integrating with custom feature pipelines and supporting services
Collaborate with platform and DevOps teams to manage infrastructure-as-code (Terraform), automate deployment (CI/CD), and ensure reliability and security
Integrate model monitoring for performance metrics, drift detection, and alerting (using tools like Prometheus, CloudWatch, or Grafana)
Improve retraining, rollback, and model versioning strategies across different deployment contexts
Support experimentation infrastructure and A/B testing integrations for ML-based products

Senior ML Platform Engineer

At WHOOP, we're on a mission to unlock human performance and healthspan. WHOOP e...

Location

United States, Boston

Salary:

150000.00 - 210000.00 USD / Year

Whoop

Expiration Date

Until further notice

Requirements

Bachelor’s or Master’s Degree in Computer Science, Engineering, or a related field, or equivalent practical experience
5+ years of experience in software engineering with a focus on ML infrastructure, cloud platforms, or MLOps
Strong programming skills in Python, with experience in building distributed systems and REST/gRPC APIs
Deep knowledge of cloud-native services and infrastructure-as-code (e.g., AWS CDK, Terraform, CloudFormation)
Hands-on experience with model deployment platforms such as AWS SageMaker, Vertex AI, or Kubernetes-based serving stacks
Proficiency in ML lifecycle tools (MLflow, Weights & Biases, BentoML) and containerization strategies (Docker, Kubernetes)
Understanding of data engineering and ingestion pipelines, with ability to interface with data lakes, feature stores, and streaming systems
Proven ability to work cross-functionally with Data Science, Data Platform, and Software Engineering teams, influencing decisions and driving alignment
Passion for AI and automation to solve real-world problems and improve operational workflows

Job Responsibility

Architect, build, own, and operate scalable ML infrastructure in cloud environments (e.g., AWS), optimizing for speed, observability, cost, and reproducibility
Create, support, and maintain core MLOps infrastructure (e.g., MLflow, feature store, experiment tracking, model registry), ensuring reliability, scalability, and long-term sustainability
Develop, evolve, and operate MLOps platforms and frameworks that standardize model deployment, versioning, drift detection, and lifecycle management at scale
Implement and continuously maintain end-to-end CI/CD pipelines for ML models using orchestration tools (e.g., Prefect, Airflow, Argo Workflows), ensuring robust testing, reproducibility, and traceability
Partner closely with Data Science, Sensor Intelligence, and Data Platform teams to operationalize and support model development, deployment, and monitoring workflows
Build, manage, and maintain both real-time and batch inference infrastructure, supporting diverse use cases from physiological analytics to personalized feedback loops for WHOOP members
Design, implement, and own automated observability tooling (e.g., for model latency, data drift, accuracy degradation), integrating metrics, logging, and alerting with existing platforms
Leverage AI-powered tools and automation to reduce operational overhead, enhance developer productivity, and accelerate model release cycles
Contribute to and maintain internal platform documentation, SDKs, and training materials, enabling self-service capabilities for model deployment and experimentation
Continuously evaluate and integrate emerging technologies and deployment strategies, influencing WHOOP’s roadmap for AI-driven platform efficiency, reliability, and scale

What we offer

equity
benefits

Full-time

Senior Machine Learning Engineer (Health)

WHOOP is an advanced health and fitness wearable, on a mission to unlock human p...

Location

United States, Boston

Salary:

150000.00 - 210000.00 USD / Year

Whoop

Expiration Date

Until further notice

Requirements

Bachelor’s Degree in Computer Science, Data Science, Applied Mathematics, or a related field. Master’s preferred
5+ years of professional experience as a Machine Learning Engineer or Software Engineer with focus on ML systems
Proven expertise working with time series data (wearable, physiological, or high-frequency sensor data strongly preferred)
Experience designing and deploying ML inference systems at scale: both real-time streaming and large-scale batch pipelines
Strong coding skills in Python (scientific stack) and SQL, with a track record of writing clean, production-quality code
Strong communication skills to collaborate across engineering, research, and product teams
Proven experience deploying and maintaining ML systems on cloud platforms (AWS or GCP)
Working familiarity with MLOps best practices: model versioning, CI/CD for ML, observability, and monitoring for inference systems
Ability to reason about and design for performance trade-offs (latency vs. throughput vs. cost) when building ML inference systems
Strong understanding of backend service development (APIs and service reliability) as it applies to serving ML models at scale

Job Responsibility

Create, improve, and maintain production services that provide analysis for health features in collaboration with Data Scientists and MLOps Engineers
Collaborate with Data Engineers to improve ML data pipelines, tooling, and validation systems that support robust model performance
Work alongside data scientists to translate research prototypes into production ML systems optimized for scale, latency, and cost efficiency
Collaborate with researchers and product teams to align model development with health insights and member impact
Participate in on-call rotations for data science services, ensuring uptime and performance in production environments

What we offer

equity
benefits

Full-time

Machine Learning Engineer

Influur is redefining how advertising works — through creators, data, and AI. Ou...

Location

United States, San Francisco Bay Area

Salary:

200000.00 USD / Year

Influur

Expiration Date

Until further notice

Requirements

Strong experience designing, building, and maintaining end-to-end machine learning systems in production
Deep understanding of ML algorithms, embeddings, retrieval systems, and evaluation methodologies
Strong experience with large language models (LLMs), fine-tuning, inference optimization, and agent frameworks
Expertise in ML infrastructure, including feature stores, vector databases, model serving, and real-time inference pipelines
Strong Python skills and experience with PyTorch, TensorFlow, FastAPI, NumPy, scikit-learn, and data processing frameworks
Experience with scalable data pipelines (batch + streaming), including tools like Spark, Kafka, or similar
Experience implementing ML solutions such as recommendation engines, ranking models, and personalization systems
Solid understanding of statistical analysis (A/B testing, experimentation, causal inference)
Ability to work closely with engineering teams to productionize ML models with reliability, monitoring, and CI/CD best practices
Writes clean, reusable, and well-documented code for ML pipelines and distributed systems

What we offer

Competitive equity in a venture-backed company shaping the future of music influencer marketing
A seat at the table as we redefine how the most iconic record labels, artists, and brands go viral (think Bad Bunny) — with our tech, support, and strategic guidance
Access to elite tools, AI copilots, and a team that builds daily at top speed
Hybrid flexibility + top-tier health benefits

Full-time

Machine Learning Engineer

Influur is redefining how advertising works — through creators, data, and AI. Ou...

Location

United States, Miami

Salary:

200000.00 USD / Year

Influur

Expiration Date

Until further notice

Requirements

Strong experience designing, building, and maintaining end-to-end machine learning systems in production
Deep understanding of ML algorithms, embeddings, retrieval systems, and evaluation methodologies
Strong experience with large language models (LLMs), fine-tuning, inference optimization, and agent frameworks
Expertise in ML infrastructure, including feature stores, vector databases, model serving, and real-time inference pipelines
Strong Python skills and experience with PyTorch, TensorFlow, FastAPI, NumPy, scikit-learn, and data processing frameworks
Experience with scalable data pipelines (batch + streaming), including tools like Spark, Kafka, or similar
Experience implementing ML solutions such as recommendation engines, ranking models, and personalization systems
Solid understanding of statistical analysis (A/B testing, experimentation, causal inference)
Ability to work closely with engineering teams to productionize ML models with reliability, monitoring, and CI/CD best practices

What we offer

Competitive equity in a venture-backed company shaping the future of music influencer marketing
A seat at the table as we redefine how the most iconic record labels, artists, and brands go viral
Access to elite tools, AI copilots, and a team that builds daily at top speed
Hybrid flexibility + top-tier health benefits

Full-time

Machine Learning Engineer

Influur is redefining how advertising works — through creators, data, and AI. Ou...

Location

Salary:

200000.00 USD / Year

Influur

Expiration Date

Until further notice

Requirements

Strong experience designing, building, and maintaining end-to-end machine learning systems in production
Deep understanding of ML algorithms, embeddings, retrieval systems, and evaluation methodologies
Strong experience with large language models (LLMs), fine-tuning, inference optimization, and agent frameworks
Expertise in ML infrastructure, including feature stores, vector databases, model serving, and real-time inference pipelines
Strong Python skills and experience with PyTorch, TensorFlow, FastAPI, NumPy, scikit-learn, and data processing frameworks
Experience with scalable data pipelines (batch + streaming), including tools like Spark, Kafka, or similar
Experience implementing ML solutions such as recommendation engines, ranking models, and personalization systems
Solid understanding of statistical analysis (A/B testing, experimentation, causal inference)
Ability to work closely with engineering teams to productionize ML models with reliability, monitoring, and CI/CD best practices

What we offer

Competitive equity in a venture-backed company shaping the future of music influencer marketing
A seat at the table as we redefine how the most iconic record labels, artists, and brands go viral (think Bad Bunny) — with our tech, support, and strategic guidance
Access to elite tools, AI copilots, and a team that builds daily at top speed
Hybrid flexibility + top-tier health benefits

Full-time

ML Engineer - Inference Serving

Luma AI

Location:
United States, Palo Alto; United Kingdom, London

Category:
IT - Software Development

Contract Type:
Not provided

Salary:

Job Description:

Job Responsibility:

Requirements:

Nice to have:

Additional Information:

Job Posted:
January 22, 2026

