ML Infrastructure Engineer

AppLovin

Location:
United States, Palo Alto

Contract Type:
Not provided

Salary:

124000.00 - 186000.00 USD / Year

Job Description:

As a member of our software engineering infra team, you'll solve technical challenges, including upgrading and implementing state-of-the-art software infrastructure. The team builds a high-performance, high-availability, globally distributed ecosystem of services that in turn provides the foundation for rapid development of new systems that integrate into that ecosystem and improve it. Our infra team is responsible for providing and maintaining scalable, high-throughput, low-latency infrastructure for our bidding ecosystem. You will be exposed to the whole model delivery pipeline, including training, serving, and optimization.

Job Responsibility:

  • Design, develop, and maintain large-scale distributed systems
  • Collaborate with various engineering teams to meet a wide range of technological challenges
  • Work closely with our research science team and backend team to contribute to and influence the roadmap of our products and technologies
  • Influence and inspire team members
  • Improve the performance of our online models
  • Optimize the model delivery pipeline

Requirements:

  • 0-2 years of experience
  • Minimum of a BS and/or MS in Computer Science
  • Excellent knowledge of computer science fundamentals including data structures, algorithms, and coding
  • Good experience with C++, Python and/or Golang is a plus
  • Experience independently creating and maintaining projects

Nice to have:

  • Good experience with C++, Python and/or Golang

What we offer:
  • Equity eligible
  • Medical, Dental, Vision, Life, Disability insurance
  • 401(k) Retirement Plan
  • Unlimited Discretionary Time Off
  • 10 paid holidays per year
  • 80 hours of paid sick leave per year

Additional Information:

Job Posted:
March 08, 2026

Employment Type:
Fulltime

Similar Jobs for ML Infrastructure Engineer

Senior Software Engineer - ML Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date:
Until further notice
Requirements:
  • 5+ years of industry experience as a software engineer, with strong focus on ML/AI infrastructure or large-scale distributed systems
  • Hands-on expertise in building and operating ML platforms (e.g., feature stores, data pipelines, training/inference frameworks)
  • Proven experience delivering reliable and scalable infrastructure in production
  • Solid understanding of ML Ops concepts and tooling, as well as best practices for observability, security, and reliability
  • Strong communication skills and ability to collaborate across teams
Job Responsibility:
  • Design and implement large-scale ML infrastructure, including feature stores, pipelines, deployment tooling, and inference systems
  • Drive the rollout of Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Help define and evangelize an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines and services, including reliability, scalability, performance, and cost efficiency
  • Collaborate with ML product teams to understand requirements and deliver solutions that accelerate experimentation and iteration
  • Contribute to technical strategy and architecture discussions within the team
  • Mentor and support other engineers through code reviews, design discussions, and technical guidance
What we offer:
  • medical, dental, vision, and 401(k)
Employment Type:
Fulltime

Data Infrastructure Engineer

A venture-backed startup at the intersection of AI and national security is buil...
Location:
United States, New York City Metropolitan Area
Salary:
Not provided
Orbis Consultants
Expiration Date:
Until further notice
Requirements:
  • Strong engineering experience in Python, Go, or C
  • Experience building and scaling production data systems
  • Hands-on expertise with model deployment and ML Ops practices
  • Knowledge of database design, performance tuning, and operations
  • Someone who thrives in early-stage, fast-paced environments and enjoys tackling complex challenges
Job Responsibility:
  • Build and maintain the data pipelines and infrastructure that power ML applications
  • Deploy and manage models at scale, from training through production
  • Design APIs and services that integrate smoothly into mission-critical workflows
  • Ensure data is handled and secured properly across large, distributed environments
  • Collaborate closely with a small, fast-moving team to solve hard technical problems in real-world settings
What we offer:
  • Significant equity
  • Strong health & wellness benefits
Employment Type:
Fulltime

Engineering Manager - Machine Learning Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
241200.00 - 400000.00 USD / Year
Plaid
Expiration Date:
Until further notice
Requirements:
  • 8–10 years of experience in ML infrastructure, including direct hands-on expertise as an engineer, IC/TL
  • 2+ years of experience managing infrastructure or ML platform engineers
  • Proven experience delivering and operating ML or AI infrastructure at scale
  • Solid technical depth across ML/AI infrastructure domains (e.g., feature stores, pipelines, deployment, inference, observability)
  • Demonstrated ability to drive execution on complex technical projects with cross-team stakeholders
  • Strong communication and stakeholder management skills
Job Responsibility:
  • Lead and support the ML Infra team, driving project execution and ensuring delivery on key commitments
  • Build and launch Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Define and drive adoption of an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines, deployment tooling, and inference systems
  • Partner with ML product teams to understand requirements and deliver solutions that accelerate model development and iteration
  • Recruit, mentor, and develop engineers, fostering a collaborative and high-performing team culture
What we offer:
  • medical
  • dental
  • vision
  • 401(k)
  • equity
  • commission
Employment Type:
Fulltime

Data Infrastructure Engineer

This young, early-stage start-up challenger is currently looking for a hands-on...
Location:
United States, New York or DC
Salary:
Not provided
Orbis Consultants
Expiration Date:
Until further notice
Requirements:
  • Startup Energy: You thrive in fast-paced environments, manage ambiguity well, and focus on what moves the needle
  • Designing and deploying intuitive, user-friendly APIs
  • Demonstrated ability to train and deploy models at scale
  • Successfully launching machine learning services, particularly those leveraging LLMs, embeddings, and inference, into production environments
  • Handling and securing large-scale production data
  • Demonstrated proficiency in Python, Go, or C
  • A proactive approach to tackling complex challenges in a fast-paced, early-stage environment
  • A passion for innovation and a collaborative spirit
Job Responsibility:
  • Developing secure data sharing middleware
  • Integrating software seamlessly into the workflows of specialized professionals, ensuring secure and efficient data access throughout the asset recruitment process
  • Building, shipping, and supporting the mission-critical services that make up the Data platform
  • Providing solutions for the full data stack, spanning the data management, software development, and model deployment lifecycles
What we offer:
  • Competitive Salary + Equity
Employment Type:
Fulltime

Data Infrastructure Engineer

This young, early-stage start-up challenger is currently looking for a hands-on ...
Location:
United States, New York or DC
Salary:
Not provided
Orbis Consultants
Expiration Date:
Until further notice
Requirements:
  • Startup Energy: You thrive in fast-paced environments, manage ambiguity well, and focus on what moves the needle
  • Designing and deploying intuitive, user-friendly APIs
  • Demonstrated ability to train and deploy models at scale
  • Successfully launching machine learning services, particularly those leveraging LLMs, embeddings, and inference, into production environments
  • Handling and securing large-scale production data
  • Demonstrated proficiency in Python, Go, or C
  • A proactive approach to tackling complex challenges in a fast-paced, early-stage environment
  • A passion for innovation and a collaborative spirit
Job Responsibility:
  • Developing secure data sharing middleware
  • Integrating software seamlessly into the workflows of specialised professionals, ensuring secure and efficient data access throughout the asset recruitment process
  • Providing solutions for the full data stack, spanning the data management, software development, and model deployment lifecycles
What we offer:
  • Equity
  • Opportunity to work with an ambitious, rapidly growing start-up
Employment Type:
Fulltime

Engineering Manager, Infrastructure

As an Engineering Manager for the Infrastructure team, you’ll lead the engineers...
Location:
Canada; United States
Salary:
195000.00 - 285000.00 USD / Year
Apollo.io
Expiration Date:
Until further notice
Requirements:
  • 5+ years of hands-on software or infrastructure engineering experience
  • 2+ years of experience leading teams of senior and staff-level engineers in platform, SRE, or infrastructure domains
  • Proven ability to design and operate large-scale distributed systems in cloud environments (preferably GCP or AWS)
  • Expertise with Kubernetes, Docker, Terraform, Ubuntu, and CI/CD pipelines
  • Familiarity with observability tools (Grafana, Prometheus, ELK, Datadog, NewRelic) and performance tuning
  • Strong grounding in networking, security, and reliability principles
  • Experience managing infrastructure costs, availability SLAs, and high-throughput systems at scale
Job Responsibility:
  • Lead, coach, and grow a distributed team of high-impact Infrastructure Engineers
  • Partner with senior engineering leadership on strategic initiatives such as cloud migration, infrastructure scaling, platform reliability, and cost efficiency
  • Define and implement modern operational excellence practices, including SLOs, error budgets, incident reviews, and performance monitoring
  • Guide technical decision-making across key areas like Kubernetes, GCP, observability, networking, CI/CD, and IaC (Terraform, Ansible)
  • Collaborate with AI, Data, and Product Engineering teams to ensure infrastructure scalability for ML and AI-native workloads
  • Run effective 1:1s, career development conversations, and quarterly performance reviews
  • Support recruiting efforts to attract top engineering talent across time zones
What we offer:
  • Equity
  • Company bonus or sales commissions/bonuses
  • 401(k) plan
  • At least 10 paid holidays per year
  • Flex PTO
  • Parental leave
  • Employee assistance program and wellbeing benefits
  • Global travel coverage
  • Life/AD&D/STD/LTD insurance
  • FSA/HSA and medical, dental, and vision benefits
Employment Type:
Fulltime

Software Engineer, Data Infrastructure

The Data Infrastructure team at Figma builds and operates the foundational platf...
Location:
United States, San Francisco; New York
Salary:
149000.00 - 350000.00 USD / Year
Figma
Expiration Date:
Until further notice
Requirements:
  • 5+ years of Software Engineering experience, specifically in backend or infrastructure engineering
  • Experience designing and building distributed data infrastructure at scale
  • Strong expertise in batch and streaming data processing technologies such as Spark, Flink, Kafka, or Airflow/Dagster
  • A proven track record of impact-driven problem-solving in a fast-paced environment
  • A strong sense of engineering excellence, with a focus on high-quality, reliable, and performant systems
  • Excellent technical communication skills, with experience working across both technical and non-technical counterparts
  • Experience mentoring and supporting engineers, fostering a culture of learning and technical excellence
Job Responsibility:
  • Design and build large-scale distributed data systems that power analytics, AI/ML, and business intelligence
  • Develop batch and streaming solutions to ensure data is reliable, efficient, and scalable across the company
  • Manage data ingestion, movement, and processing through core platforms like Snowflake, our ML Datalake, and real-time streaming systems
  • Improve data reliability, consistency, and performance, ensuring high-quality data for engineering, research, and business stakeholders
  • Collaborate with AI researchers, data scientists, product engineers, and business teams to understand data needs and build scalable solutions
  • Drive technical decisions and best practices for data ingestion, orchestration, processing, and storage
What we offer:
  • equity
  • health, dental & vision
  • retirement with company contribution
  • parental leave & reproductive or family planning support
  • mental health & wellness benefits
  • generous PTO
  • company recharge days
  • a learning & development stipend
  • a work from home stipend
  • cell phone reimbursement
Employment Type:
Fulltime

Senior Software Engineer - Network Enablement (Applied ML)

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date:
Until further notice
Requirements:
  • Strong software engineering skills including systems design, APIs, and building reliable backend services (Go or Python preferred)
  • Production experience with batch and streaming data pipelines and orchestration tools such as Airflow or Spark
  • Experience building or operating real-time scoring and online feature-serving systems, including feature stores and low-latency model inference
  • Experience integrating model outputs into product flows (APIs, feature flags) and measuring impact through experiments and product metrics
  • Experience with model lifecycle and operations: model registries, CI/CD for models, reproducible training, offline & online parity, monitoring and incident response
Job Responsibility:
  • Embed model inference into Network Enablement product flows and decision logic (APIs, feature flags, backend flows)
  • Define and instrument product + ML success metrics (fraud reduction, retention lift, false positives, downstream impact)
  • Design and run experiments and rollout plans (backtesting, shadow scoring, A/B tests, feature-flagged releases) to validate product hypotheses
  • Build and operate offline training pipelines and production batch scoring for bank intelligence products
  • Ship and maintain online feature serving and low-latency model inference endpoints for real-time partner/bank scoring
  • Implement model CI/CD, model/version registry, and safe rollout/rollback strategies
  • Monitor model/data health: drift/regression detection, model-quality dashboards, alerts, and SLOs targeted to partner product needs
  • Ensure offline and online parity, data lineage, and automated validation / data contracts to reduce regressions
  • Optimize inference performance and cost for real-time scoring (batching, caching, runtime selection)
  • Ensure fairness, explainability and PII-aware handling for partner-facing ML features
What we offer:
  • medical
  • dental
  • vision
  • 401(k)
  • equity
  • commission
Employment Type:
Fulltime