ML Infra Engineer (Data Systems)

Physical Intelligence

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:
Not provided

Job Description:

As an ML Infra Engineer (Data Systems), you’ll build and operate the data infrastructure that powers large-scale robot learning. Your systems will sit directly between raw data sources and training/evaluation, enabling us to move faster while maintaining performance, correctness, and reliability at scale. This is a systems role at the intersection of distributed systems, storage, and machine learning infrastructure.

Job Responsibility:

  • Data Ingestion & Processing: Design and build high-throughput pipelines that validate, transform, and featurize raw multimodal data
  • Batch & Streaming Systems: Operate large-scale batch and streaming workflows over massive datasets
  • Storage Systems: Design object storage layouts, metadata systems, and efficient access patterns; choose file formats with performance and scalability in mind
  • Data Lifecycle Management: Build systems for backfills, dataset rebuilds, garbage collection, and large-scale transformations
  • Training-Time Performance: Optimize dataloaders, sharding, prefetching, caching, and throughput to reduce time from data arrival → model training
  • Metadata & Indexing: Build scalable metadata stores for datasets, annotations, and training artifacts
  • Data Movement: Move hundreds of terabytes to petabytes efficiently across clusters and environments
  • Operational Correctness: Implement observability, validation, and guardrails to prevent silent data regressions
  • Cross-Functional Collaboration: Work closely with cross-functional teams of researchers, engineers and roboticists to translate evolving data needs into robust systems

Requirements:

  • Strong software engineering fundamentals
  • Experience building distributed systems or large-scale data pipelines
  • Comfort reasoning about performance, memory, I/O, and storage efficiency
  • Familiarity with batch and/or streaming processing systems
  • Experience with object storage systems and data format tradeoffs
  • Ownership mindset: design, build, operate, and iterate on systems end-to-end
  • Enjoy working closely with researchers and unblocking fast-moving projects

Nice to have:

  • Experience with large ML training pipelines or dataloading systems
  • Knowledge of columnar or custom data formats
  • Experience with systems like ClickHouse, Ray, Flink, Spark, or similar
  • Hands-on experience operating petabyte-scale datasets
  • Experience debugging and fixing performance bottlenecks in data-heavy systems

Additional Information:

Job Posted:
February 21, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for ML Infra Engineer (Data Systems)

Data Engineer

Influur is redefining how advertising works, through creators, data, and AI. Our...
Location:
Mexico, Mexico City
Salary:
Not provided
Influur
Expiration Date:
Until further notice
Requirements:
  • Strong programming with Python and SQL
  • Comfortable building from scratch and improving existing code
  • Expertise in data modeling and warehousing, including dimensional modeling and performance tuning
  • Experience designing and operating ETL and ELT pipelines with tools like Airflow or Dagster, plus dbt for transformations
  • Hands-on with batch and streaming systems and with Lakehouse or warehouse tech on AWS or GCP
  • Proficiency integrating third-party APIs and datasets, ensuring reliability, lineage, and governance
  • Familiarity with AI data needs: feature stores, embedding pipelines, vector databases, and feedback loops that close the gap between model and outcome
  • High standards for code quality, testing, observability, and CI
  • Comfortable with Docker and modern cloud infra
Job Responsibility:
  • Treats data as a product and ships improvements that users feel
  • Moves fast without breaking trust
  • Owns problems across the stack, from ingestion to modeling to serving
  • Communicates clearly with ML engineers, analysts, and business partners
  • Experiments, measures, and iterates
  • Sees ambiguity as a chance to design the standard everyone else will follow
What we offer:
  • Competitive equity in a venture-backed company
  • Opportunities to grow and develop
  • Remote work
Employment Type:
Full-time

Data Engineer

Influur is redefining how advertising works, through creators, data, and AI. Our...
Location:
Colombia, Bogotá
Salary:
Not provided
Influur
Expiration Date:
Until further notice
Requirements:
  • Strong programming with Python and SQL
  • Comfortable building from scratch and improving existing code
  • Expertise in data modeling and warehousing, including dimensional modeling and performance tuning
  • Experience designing and operating ETL and ELT pipelines with tools like Airflow or Dagster, plus dbt for transformations
  • Hands-on with batch and streaming systems and with Lakehouse or warehouse tech on AWS or GCP
  • Proficiency integrating third-party APIs and datasets, ensuring reliability, lineage, and governance
  • Familiarity with AI data needs: feature stores, embedding pipelines, vector databases, and feedback loops that close the gap between model and outcome
  • High standards for code quality, testing, observability, and CI
  • Comfortable with Docker and modern cloud infra
Job Responsibility:
  • Treats data as a product and ships improvements that users feel
  • Moves fast without breaking trust. You value contracts, schemas, and backward compatibility
  • Owns problems across the stack, from ingestion to modeling to serving
  • Communicates clearly with ML engineers, analysts, and business partners
  • Experiments, measures, and iterates. You set measurable SLAs and keep them green
  • Sees ambiguity as a chance to design the standard everyone else will follow
What we offer:
  • Competitive equity in a venture-backed company
  • Opportunities to grow and develop
  • Remote work
Employment Type:
Full-time

Data Engineer

Influur is redefining how advertising works, through creators, data, and AI. Our...
Location:
Argentina, Buenos Aires
Salary:
Not provided
Influur
Expiration Date:
Until further notice
Requirements:
  • Strong programming with Python and SQL
  • Comfortable building from scratch and improving existing code
  • Expertise in data modeling and warehousing, including dimensional modeling and performance tuning
  • Experience designing and operating ETL and ELT pipelines with tools like Airflow or Dagster, plus dbt for transformations
  • Hands-on with batch and streaming systems and with Lakehouse or warehouse tech on AWS or GCP
  • Proficiency integrating third-party APIs and datasets, ensuring reliability, lineage, and governance
  • Familiarity with AI data needs: feature stores, embedding pipelines, vector databases, and feedback loops that close the gap between model and outcome
  • High standards for code quality, testing, observability, and CI
  • Comfortable with Docker and modern cloud infra
Job Responsibility:
  • Treats data as a product and ships improvements that users feel
  • Moves fast without breaking trust. You value contracts, schemas, and backward compatibility
  • Owns problems across the stack, from ingestion to modeling to serving
  • Communicates clearly with ML engineers, analysts, and business partners
  • Experiments, measures, and iterates. You set measurable SLAs and keep them green
  • Sees ambiguity as a chance to design the standard everyone else will follow
What we offer:
  • Competitive equity in a venture-backed company
  • Opportunities to grow and develop
  • Remote work
Employment Type:
Full-time

Principal Engineer - Marketplace

Principal Engineer role in the Marketplace Engineering team to lead breakthrough...
Location:
United States, San Francisco; Sunnyvale
Salary:
302000.00 - 336000.00 USD / Year
Uber
Expiration Date:
Until further notice
Requirements:
  • PhD in Computer Science, Machine Learning, Operations Research, or related quantitative field OR Master’s degree with 12+ years of industry experience
  • 10+ years of experience building and deploying ML models in large-scale production environments
  • Expert-level proficiency in modern ML frameworks (TensorFlow, PyTorch, JAX) and distributed computing platforms (Spark, Ray)
  • Deep expertise across multiple areas including: Deep Learning, Causal Inference, Reinforcement Learning, Multi-objective Optimization, Algorithmic Game Theory, and Large-scale Ads Ranking/Auction Systems
  • Proven track record of leading complex ML projects from research through production with significant measurable business impact
  • Strong programming skills in Python, Java, or Go with experience building production ML systems
  • Experience with feature engineering, model serving, and ML infrastructure at scale (handling millions of predictions per second)
  • Technical leadership experience including mentoring senior engineers and driving cross-team technical initiatives
  • Advanced Deep Learning and Neural Network architectures
  • Scalable ML architecture and distributed model training
Job Responsibility:
  • Lead the design and implementation of advanced ML systems for dynamic pricing algorithms serving millions of drivers across 70+ countries around the world
  • Architect real-time ML infrastructure handling 1M+ pricing decisions per second with sub-50ms latency requirements
  • Drive breakthrough research in causal ML, reinforcement learning, algorithmic game theory, and multi-objective optimization for marketplace optimization with strategic agents
  • Own end-to-end ML model lifecycle from research through production deployment and continuous optimization
  • Develop and enforce best practices in system design, ensuring data integrity, security, and optimal performance
  • Serve as a representative for the Marketplace organization to the broader internal and external technical community
  • Contribute to the eng brand for Marketplace and serve as a talent magnet to help attract and retain talent for the team
  • Stay abreast of industry trends and emerging technologies in software engineering, focused particularly on ML/AI, to enhance our systems and processes continually
  • Build scalable ML architecture and feature management systems supporting Driver Pricing and broader Marketplace teams
  • Design experimentation frameworks enabling rapid testing of pricing algorithms using A/B, Switchback, Synthetic Control, and other experimental methodologies
What we offer:
  • Eligible to participate in Uber's bonus program
  • May be offered an equity award & other types of comp
  • Eligible to participate in a 401(k) plan
  • Eligible for various benefits (details at provided link)
Employment Type:
Full-time

Staff Data Engineer

At Vanta, our mission is to help businesses earn and prove trust. We believe tha...
Location:
United States
Salary:
213000.00 - 251000.00 USD / Year
Vanta
Expiration Date:
Until further notice
Requirements:
  • Have at least six years of experience working with data
  • Have at least two years of experience in Software Engineering or a related field
  • Have experience with common analytics tooling (e.g. Stitch/Fivetran, Snowflake/BigQuery/Redshift, dbt, Airflow, Dagster, Looker/Mode/Sigma)
  • Have led an implementation of Debezium or another CDC event tracking system
  • Have good working knowledge of AWS data infra systems and Terraform
  • Bring a system-oriented and software engineering mindset to the Data Engineering practice
  • Deep knowledge of crafting dimensional and fact models in modern data fashion
  • Have a passion for enabling the developer experience of data
  • Desire to lead the industry in security, anonymization, and compliance management when it comes to data warehousing
  • Open to using AI to amplify their skills and strengthen their work
Job Responsibility:
  • Design and implement complex data models, modeling metadata, building reports and dashboards and creating reporting tools for data science and ML products users
  • Design and deploy data infrastructure needed to drive data-driven decision-making solutions
  • Be the company’s expert on data administration and master data management
  • Be a technical thought leader on the development of scalable data systems
  • Develop front end applications to expose analytical data sets enterprise wide
  • Write highly tuned, scalable SQL queries running over large-scale, heterogeneous data warehouses
  • Work with the Product and Enterprise Engineering system teams to structure source systems for reporting consumption across the enterprise
  • Help maintain the CDC pipeline to power customer reporting
What we offer:
  • Offers Equity
  • Medical benefits
  • 401(k) plan
  • Other company perk programs
  • Comprehensive medical, dental, and vision coverage, with 100% of employee-only benefit premiums covered for most medical plans
  • 16 weeks fully-paid Parental Leave for all new parents
  • Health & wellness stipend
  • Remote workspace, internet, and cellphone stipend
  • Commuter benefits for team members who report to the SF and NYC office
  • Family planning benefits
Employment Type:
Full-time

Data Platform Engineer

Our enterprise clients are moving from fragmented data foundations to AI-first d...
Location:
Switzerland, Genève
Salary:
Not provided
Randstad
Expiration Date:
April 30, 2026
Requirements:
  • Strong background in data engineering for large-scale systems
  • Proven experience delivering production-grade data pipelines
  • Familiarity with enterprise data landscapes and constraints
  • Engineering-first approach to data
  • Strong ownership and accountability for data reliability
  • Comfortable operating in complex, multi-stakeholder environments
  • High standards for robustness, scalability, and maintainability
Job Responsibility:
  • Design, build, and maintain scalable batch and streaming data pipelines
  • Implement data ingestion from heterogeneous enterprise sources (databases, APIs, events, files)
  • Structure data for downstream AI and ML consumption
  • Ensure data quality, consistency, and availability across environments
  • Build and operate feature stores and analytical data layers for ML
  • Design data models optimized for training and inference workloads
  • Enable efficient data access patterns for real-time and near-real-time AI use cases
  • Support experimentation while enforcing production-grade standards
  • Translate business and AI requirements into robust data architectures
  • Collaborate closely with AI/ML Engineers, MLOps, Infra, Security, and Product teams
Employment Type:
Full-time

Machine Learning Data Engineer - Systems & Retrieval

As a Machine Learning Data Engineer - Systems & Retrieval, you will build and op...
Location:
United States, Palo Alto
Salary:
Not provided
Zyphra
Expiration Date:
Until further notice
Requirements:
  • Strong software engineering background with fluency in Python
  • Experience designing, building, and maintaining data pipelines in production environments
  • Deep understanding of data structures, storage formats, and distributed data systems
  • Familiarity with indexing and retrieval techniques for large-scale document corpora
  • Understanding of database systems (SQL and NoSQL), their internals, and performance characteristics
  • Strong attention to security, access controls, and compliance best practices (e.g., GDPR, SOC2)
  • Excellent debugging, observability, and logging practices to support reliability at scale
  • Strong communication skills and experience collaborating across ML, infra, and product teams
Job Responsibility:
  • Design and implementation of distributed data ingestion and transformation pipelines
  • Building retrieval and indexing systems that support RAG and other LLM-based methods
  • Mining and organizing large unstructured datasets, both in research and production environments
  • Collaborating with ML engineers, systems engineers, and DevOps to scale pipelines and observability
  • Ensuring compliance and access control in data handling, with security and auditability in mind
What we offer:
  • Comprehensive medical, dental, vision, and FSA plans
  • Competitive compensation and 401(k)
  • Relocation and immigration support on a case-by-case basis
  • On-site meals prepared by a dedicated culinary team
  • Thursday Happy Hours
Employment Type:
Full-time

Senior CVML Platform Engineer

We are seeking a Senior CVML Platform Engineer to help design, build, and evolve...
Location:
United States
Salary:
160000.00 - 287000.00 USD / Year
Blue River Technology
Expiration Date:
Until further notice
Requirements:
  • 5+ years of professional engineering experience, with a focus on platform, infrastructure, or systems engineering
  • Strong technical judgment, balancing the evolution of legacy platforms with the design and delivery of new, greenfield components shared across multiple teams and workloads
  • Excellent Python skills, used in production systems, tooling, and platform components
  • Solid understanding of ML systems and the end-to-end model development lifecycle, from experimentation to deployment and iteration
  • Hands-on experience or strong familiarity with cloud platforms (AWS preferred) and container orchestration systems such as Kubernetes and Slurm
  • Ability to partner effectively with ML engineers, infra teams, and product stakeholders to translate requirements into platform capabilities
  • Ability to quickly ramp up on new domains, tools, and complex existing systems
Job Responsibility:
  • Design, build, and evolve platform capabilities that support ML training, batch inference, and model deployment workflows at scale
  • Own and improve core platform components (e.g., compute orchestration, data pipelines, inference systems) used by multiple teams across Blue River and John Deere
  • Continuously enhance platform reliability, scalability, and performance, with a focus on real-world ML workloads
  • Enable ML engineers to move faster by building intuitive, well-documented platform tools and workflows across the model lifecycle (experimentation, deployment, and iteration)
  • Improve model inference performance and throughput while balancing trade-offs among cost, latency, and reliability
  • Support and scale distributed training and inference systems, including frameworks such as Ray and related tooling
  • Develop and optimize hybrid compute environments (cloud + on-prem/GPU infrastructure) to support large-scale ML workloads
  • Build and maintain infrastructure leveraging Kubernetes, Slurm, and cloud platforms (AWS preferred)
  • Identify and resolve bottlenecks in compute, storage, and data movement pipelines
  • Evaluate existing platform systems and make thoughtful decisions on when to extend, refactor, or rebuild components
What we offer:
  • Bonus and benefit programs
Employment Type:
Full-time