AI Research Engineer, Data Infrastructure

United States, Palo Alto · 180000.00 - 250000.00 USD / Year · Job Posted December 14, 2025

Job Description

As a Research Engineer in Infrastructure, you will design and implement a robust data engine to manage the data collected by our humanoid robot fleet. You’ll be responsible for making this data easily accessible for querying and training purposes. Your work will support high-quality data pipelines, enabling efficient model development, large-scale annotation, and integration across robotic, on-premise, and cloud systems.

Job Responsibility

  • Optimize operational efficiency of data collection across the NEO robot fleet
  • Design intelligent triggers to determine when and what data should be uploaded from the robots
  • Automate ETL pipelines to make fleet-wide data easily queryable and training-ready
  • Collaborate with external dataset providers to prepare diverse multi-modal pre-training datasets
  • Build frontend tools for visualizing and automating the labeling of large datasets
  • Develop machine learning models for automatic dataset labeling and organization

Requirements

  • Strong experience in building data pipelines and ETL systems
  • Ability to design and implement systems for data collection and management from robotic fleets
  • Familiarity with architectures that span on-robot components, on-premise clusters, and cloud infrastructure
  • Experience with data labeling tools or building dataset visualization and annotation tooling
  • Proficiency in creating or applying machine learning models for dataset organization and automated labeling

What we offer

  • Equity
  • Health, dental, and vision insurance
  • 401(k) with company match
  • Paid time off and holidays

Similar Jobs for AI Research Engineer, Data Infrastructure

8 matching positions

AI Research Engineer, Data Infrastructure

As a Research Engineer in Infrastructure, you will design and implement a robust...
Location: United States, Palo Alto
Salary: 180000.00 - 250000.00 USD / Year
Company: 1X Technologies
Expiration Date: Until further notice
Requirements
  • Strong experience in building data pipelines and ETL systems
  • Ability to design and implement systems for data collection and management from robotic fleets
  • Familiarity with architectures that span on-robot components, on-premise clusters, and cloud infrastructure
  • Experience with data labeling tools or building dataset visualization and annotation tooling
  • Proficiency in creating or applying machine learning models for dataset organization and automated labeling
Job Responsibility
  • Optimize operational efficiency of data collection across the NEO robot fleet
  • Design intelligent triggers to determine when and what data should be uploaded from the robots
  • Automate ETL pipelines to make fleet-wide data easily queryable and training-ready
  • Collaborate with external dataset providers to prepare diverse multi-modal pre-training datasets
  • Build frontend tools for visualizing and automating the labeling of large datasets
  • Develop machine learning models for automatic dataset labeling and organization
What we offer
  • Equity
  • Health, dental, and vision insurance
  • 401(k) with company match
  • Paid time off and holidays
Employment Type: Full-time

Research Engineer, Data Infrastructure

As a Research Engineer in Data Infrastructure, you will design and implement a “...
Location: United States, Palo Alto
Salary: 180000.00 - 250000.00 USD / Year
Company: 1X Technologies
Expiration Date: Until further notice
Requirements
  • Strong experience in building data pipelines and ETL systems
  • Ability to design and implement systems that collect, upload, and manage data from robotic fleets
  • Familiarity with architectures combining on‑robot components, on‑premises clusters, and cloud systems
  • Experience with data labeling tools or building tooling for dataset visualization and annotation
  • Skills in creating or applying machine learning models for dataset organization / automated labeling
Job Responsibility
  • Optimize operational efficiency of data collection on the NEO fleet
  • Design triggers on the robot to determine if and when data should be uploaded
  • Automate ETL pipelines so fleet‑wide data is easily queryable and available for training
  • Work with external dataset providers to prepare diverse multi-modal pre-training datasets
  • Build frontend tools for visualizing and automating labeling of very large datasets
  • Develop machine learning models to automatically label and organize datasets
What we offer
  • Health, dental, and vision insurance
  • 401(k) with company match
  • Paid time off and holidays
Employment Type: Full-time

AI Research Infrastructure Engineer

Block is scaling Customer Insights into an AI-powered insights accelerator that ...
Location: United States, Bay Area
Salary: 168300.00 - 297000.00 USD / Year
Company: Cash App
Expiration Date: Until further notice
Requirements
  • 7+ years of experience in research, automation implementation, analytics, or related technical fields with hands-on workflow optimization experience
  • 3+ years implementing AI/ML solutions, with experience in automation, LLM integration, or applied AI/analytics workflows
  • Hands-on technical skills in programming languages (Python, R, SQL) for automation development, API/MCP integrations, cloud platforms, and research data pipeline creation
  • Experience with research and analytics platforms and tools (Qualtrics, Snowflake, etc.) or transferable experience with analytics and automation platforms
  • Strong technical communication and translation skills with ability to make complex AI/ML concepts, data architecture decisions, and automation workflows accessible and actionable for researchers, product managers, and business stakeholders
  • Proven ability to build stakeholder confidence and alignment during technology transformation
  • Strong project management skills with ability to coordinate multiple complex automation initiatives, manage competing priorities, and deliver measurable operational efficiency gains (reduced cycle times, improved quality outcomes, increased research capacity)
  • Familiarity with financial services, fintech, or payments industry research contexts and regulatory requirements preferred
Job Responsibility
  • Design, build, and deploy AI agents and agentic workflows that automate research operations from study design through insights delivery, using LLMs, prompt engineering, MCP (Model Context Protocol) integrations, and workflow orchestration integrated with existing research and analytics tech stack
  • Design, build, and maintain automated data pipelines that ingest, transform, and unify research data from diverse sources (surveys, transcripts, analytics, behavioral logs) into AI-ready repositories with RAG capabilities for instant insight access via tools like Goose
  • Architect ETL/ELT frameworks using Python, SQL or equivalent tools to ensure data consistency, traceability, and scalability
  • Develop data models and schemas for research metadata, participant data, and AI-generated insights to support efficient querying and analysis
  • Design and prototype research automation systems using AI/ML techniques, partnering with design & engineering teams to productionize solutions
  • Partner with engineering, design, and platform teams to integrate research automation systems with Block's tech stack (e.g., Goose, GitHub) and establish governance frameworks for quality, ethics, and compliance
  • Mentor team members on AI agent development, agentic system design, and research automation best practices to build organizational capabilities in intelligent automation
What we offer
  • Remote work
  • medical insurance
  • flexible time off
  • retirement savings plans
  • modern family planning
Employment Type: Full-time

Senior Machine Learning Engineer (Research Scientist) - Data Foundation & AI

We build simple yet innovative consumer products and developer APIs that shape h...
Location: United States, New York
Salary: 228960.00 - 315360.00 USD / Year
Company: Plaid
Expiration Date: Until further notice
Requirements
  • Strong applied ML research skills with production delivery experience
  • Depth in Transformers/LLMs, representation learning, or large-scale model training
  • Demonstrated ability to ship models to production (not just prototype)
  • Distributed training experience and strong Python + software engineering fundamentals
  • Fintech / financial data domain experience is a plus
  • External publications or open-source contributions are a plus
Job Responsibility
  • Building a foundation model on one of the world’s richest financial datasets that no one else has
  • Doing research that ships: moving from experimentation and prototypes to production systems serving real customers
  • Working across the full ML stack, from pretraining objectives and architectures to serving infrastructure and monitoring
  • Collaborating with a high-caliber team and seeing your work amplify the capabilities of multiple product teams
  • Helping hundreds of millions of consumers achieve greater financial freedom through data-driven products
Employment Type: Full-time

Research Engineer, Text Data Research - MSL FAIR

Meta is seeking AI research engineers to help us build the data foundation for M...
Location: United States, Menlo Park
Salary: 257000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • 2+ years of industry research experience in LLM/NLP or related AI/ML models
  • Experience as a formal technical lead, leading major technical initiatives with cross-functional impact, and/or influencing strategy across multiple teams
  • Practical experience with pre-training or mid-training data curation for large foundational models and experience working with organic, synthetic, agentic, or reasoning data for LLMs
  • Demonstrated data infrastructure and software background, and experience building data tooling and services
  • Published research in leading peer-reviewed conferences (e.g., NeurIPS, ICML, ICLR, ACL, EMNLP) and/or demonstrated significant industry influence in the field of AI
Job Responsibility
  • Collaborate with cross-functional teams to develop Meta’s next foundational models
  • Architect efficient and scalable data curation systems and pipelines
  • Fundamentally improve our data velocity across workflows and projects by contributing to the advancement of data tooling
  • Execute on high priority projects in pre-training, mid-training, or post-training data curation
  • Apply specialized expertise in agentic data, synthetic data, reasoning data, web parser, coding data, data scaling laws, or datamix optimization
  • Lead complex technical projects end-to-end
What we offer
  • bonus
  • equity
  • benefits
Employment Type: Full-time

Research Engineer, Media Data Research - MSL FAIR

Meta is seeking AI research engineers to help us build the data foundation for M...
Location: United States, Menlo Park
Salary: 217000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • 1+ year of industry research experience in LLM/LMM, computer vision, or related AI/ML models
  • Experience owning and/or driving complex technical projects from end-to-end
  • Practical experience with multimodal pre-training or mid-training data curation for large media perception or generation models
  • Demonstrated data infrastructure and software background, and experience building data tooling and services
  • Published research in leading peer-reviewed conferences (e.g., ACL, NeurIPS, ICML, ICLR, AAAI, KDD, CVPR, ICCV) and/or demonstrated significant industry influence in the field of AI
Job Responsibility
  • Collaborate with cross-functional teams to develop Meta’s next foundational models
  • Architect efficient and scalable data curation systems and pipelines
  • Fundamentally improve our data velocity across workflows and projects by contributing to the advancement of data tooling
  • Execute on high priority projects in pre-training, mid-training, or post-training data curation
  • Apply specialized expertise in video/image generation, video/image perception, OCR, data scaling laws, or data mixing
  • Lead complex technical projects end-to-end
What we offer
  • bonus
  • equity
  • benefits

Research Engineer, Media Data Research - MSL FAIR

Meta is seeking AI research engineers to help us build the data foundation for M...
Location: United States, Menlo Park
Salary: 257000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • 2+ years of industry research experience in LLM/NLP, computer vision, or related AI/ML models
  • Experience as a formal technical lead, leading major technical initiatives with cross-functional impact, and/or influencing strategy across multiple teams
  • Practical experience with multimodal pre-training or mid-training data curation for large media perception or generation models
  • Demonstrated data infrastructure and software background, and experience building data tooling and services
  • Published research in leading peer-reviewed conferences (e.g., ACL, NeurIPS, ICML, ICLR, AAAI, KDD, CVPR, ICCV) and/or demonstrated significant industry influence in the field of AI
Job Responsibility
  • Collaborate with cross-functional teams to develop Meta’s next foundational models
  • Architect efficient and scalable data curation systems and pipelines
  • Fundamentally improve our data velocity across workflows and projects by contributing to the advancement of data tooling
  • Execute on high priority projects in pre-training, mid-training, or post-training data curation
  • Apply specialized expertise in video/image generation, video/image perception, OCR, data scaling laws, or data mixing
  • Lead complex technical projects end-to-end
What we offer
  • bonus
  • equity
  • benefits

Data Research Engineer

Fundamental is an AI company pioneering the future of enterprise decision-making...
Location: Spain, Barcelona
Salary: Not provided
Company: Fundamental
Expiration Date: Until further notice
Requirements
  • Experience identifying good data sources to train and evaluate ML models, including real-world and realistic synthetic data sources
  • Experience bringing data from structured and unstructured sources, as well as simulators and causal models, into formats accessible by ML models
  • Strong fundamentals of software engineering
  • Strong knowledge of Python
  • Strong knowledge of the Python data processing stack (numpy, pandas, …)
  • Familiarity with distributed processing (e.g., Ray, Dask, Spark, Beam)
  • Familiarity with data storage solutions
  • Basic ML knowledge
Job Responsibility
  • Helping to identify, characterize and evaluate data sources, including realistic synthetic data generated from Structured Causal Models and physical / systems-based simulators
  • Building and maintaining ETL pipelines
  • Designing and implementing scalable, reliable data storage solutions
  • Collaborating with the rest of the research team to maintain a reliable, efficient training pipeline where data is a critical component
  • Collaborating with the wider engineering and infrastructure team
What we offer
  • Competitive compensation with salary and equity
  • Comprehensive health coverage for you and your dependents
  • Paid parental leave for all new parents, inclusive of adoptive and surrogate journeys
  • Relocation support for employees moving to join the team in one of our office locations
  • A mission-driven, low-ego culture that values diversity of thought, ownership, and bias toward action
Employment Type: Full-time