Software Engineer - Data Infrastructure

Luma AI

Location:
United States; United Kingdom, Palo Alto

Contract Type:
Not provided

Salary:
Not provided

Job Description:

As a Data Infrastructure Engineer at Luma, you will play a critical role in building and scaling the data infrastructure that supports our cutting-edge multimodal AI systems. Your work will focus on developing high-throughput, large-scale data processing pipelines tailored for machine learning research and internal ML platform needs. You will collaborate closely with ML researchers and product teams to create reliable, efficient, and easy-to-use data infrastructure that empowers innovation and accelerates development. This role requires a strong foundation in distributed systems and data engineering, with an emphasis on supporting complex machine learning workflows rather than traditional product data infrastructure.

Job Responsibility:

  • Build and maintain scalable data infrastructure for high-throughput machine learning workflows
  • Collaborate with ML researchers and product teams to ensure data systems meet evolving needs
  • Develop and optimize large-scale data pipelines and batch processing jobs
  • Contribute to the architecture and implementation of reliable, high-performance data platforms
  • Integrate open-source tools into the data infrastructure
  • Participate in cross-functional projects to improve data reliability, scalability, and operational excellence
  • Support the evaluation and adoption of new programming languages and frameworks relevant to data infrastructure
  • Engage in continuous improvement of data infrastructure through monitoring, troubleshooting, and performance tuning
  • Collaborate with research & engineering teams to help define and refine best practices for data infrastructure development

Requirements:

  • Proficiency in Python (or similar languages with willingness to learn Python) and experience with large-scale, high-throughput data infrastructure
  • Familiarity with distributed computing frameworks (e.g., Ray, Spark, Beam)
  • Ability to design and optimize data pipelines for ML research and internal teams
  • Strong problem-solving skills and understanding of data engineering at scale
  • Collaborative, product-focused mindset with a strong execution focus
  • Comfortable working in fast-paced environments
  • Experience sourcing, integrating, and optimizing data from diverse and large datasets
  • Open to candidates across seniority levels, from mid-level individual contributors to senior engineers and managers

Nice to have:

  • Prior experience working with complex data infrastructure or AI/ML platforms is highly desirable
  • Experience with open source data infrastructure projects is a plus

Additional Information:

Job Posted:
January 13, 2026

Employment Type:
Fulltime
Work Type:
Hybrid work

Similar Jobs for Software Engineer - Data Infrastructure

Senior Principal Data Platform Software Engineer

We’re looking for a Sr Principal Data Platform Software Engineer (P70) to be a k...
Location:
Salary:
239400.00 - 312550.00 USD / Year
Atlassian
Expiration Date
Until further notice
Requirements:
  • 15+ years in Data Engineering, Software Engineering, or related roles, with substantial exposure to big data ecosystems
  • Demonstrated experience building and operating data platforms or large‑scale data services in production
  • Proven track record of building services from the ground up (requirements → design → implementation → deployment → ongoing ownership)
  • Hands‑on experience with AWS, GCP (e.g., compute, storage, data, and streaming services) and cloud‑native architectures
  • Practical experience with big data technologies, such as Databricks, Apache Spark, AWS EMR, Apache Flink, or StarRocks
  • Strong programming skills in one or more of: Kotlin, Scala, Java, Python
  • Experience leading cross‑team technical initiatives and influencing senior stakeholders
  • Experience mentoring Staff/Principal engineers and lifting the technical bar for a team or org
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience
Job Responsibility:
  • Design, develop and own delivery of high quality big data and analytical platform solutions aiming to solve Atlassian’s needs to support millions of users with optimal cost, minimal latency and maximum reliability
  • Improve and operate large‑scale distributed data systems in the cloud (primarily AWS, with increasing integration with GCP and Kubernetes‑based microservices)
  • Drive the evolution of our high-performance analytical databases and their integrations with products, cloud infrastructures (AWS and GCP) and isolated cloud environments
  • Help define and uplift engineering and operational standards for petabyte scale data platforms, with sub‑second analytic queries and multi‑region availability (coding guidelines, code review practices, observability, incident response, SLIs/SLOs)
  • Partner across multiple product and platform teams (including Analytics, Marketplace/Ecosystem, Core Data Platform, ML Platform, Search, and Oasis/FedRAMP) to deliver company‑wide initiatives that depend on reliable, high‑quality data
  • Act as a technical mentor and multiplier, raising the bar on design quality, code quality, and operational excellence across the broader team
  • Design and implement self‑healing, resilient data platforms with strong observability, fault tolerance, and recovery characteristics
  • Own the long‑term architecture and technical direction of Atlassian’s product data platform with projects that are directly tied to Atlassian’s company-level OKRs
  • Be accountable for the reliability, cost efficiency, and strategic direction of Atlassian’s product analytical data platform
  • Partner with executives and influence senior leaders to align engineering efforts with Atlassian’s long-term business objectives
What we offer:
  • health and wellbeing resources
  • paid volunteer days
  • Fulltime

Senior Rust Software Engineer - Data Classification

As a Senior Engineer on the Data Classification team, you’ll design and developm...
Location:
Israel, Tel Aviv
Salary:
Not provided
Palo Alto Networks
Expiration Date
Until further notice
Requirements:
  • BSc in Computer Science with 5+ years, or MSc with 3+ years, or equivalent military experience
  • Experience with systems-level languages (C++, C, Go, etc.)
  • Deep understanding of memory management
  • Strong understanding of application and OS interaction
  • Experience with multi-threaded and multi-process development with a performance focus
  • Familiarity with CI/CD pipelines and cloud infrastructure; ability to "make stuff work" on top of writing good code
Job Responsibility:
  • Develop solutions for data security and classification using Rust, Python & Golang
  • Contribute to feature development (design, implementation, testing, deployment)
  • Collaborate with cross-functional teams for product and infrastructure integration
  • Innovate solutions for high-scale data operations
  • Serve as a leader, improving the work of others
  • Generate ideas and participate in brainstorming
  • Identify and push for team improvements
  • Fulltime

Senior Software Engineer, Data Platform

We are looking for a foundational member of the Data Team to enable Skydio to ma...
Location:
United States, San Mateo
Salary:
180000.00 - 240000.00 USD / Year
Skydio
Expiration Date
Until further notice
Requirements:
  • 5+ years of professional experience
  • 2+ years in software engineering
  • 2+ years in data engineering with a bias towards getting your hands dirty
  • Deep experience with Databricks building pipelines, managing datasets, and developing dashboards or analytical applications
  • Proven track record of operating scalable data platforms, defining company-wide patterns that ensure reliability, performance, and cost effectiveness
  • Proficiency in SQL and at least one modern programming language (we use Python)
  • Comfort working across the full data stack — from ingestion and transformation to orchestration and visualization
  • Strong communication skills, with the ability to collaborate effectively across all levels and functions
  • Demonstrated ability to lead technical direction, mentor teammates, and promote engineering excellence and best practices across the organization
  • Familiarity with AI-assisted data workflows, including tools that accelerate data transformations or enable natural-language interfaces for analytics
Job Responsibility:
  • Design and scale the data infrastructure that ingests live telemetry from tens of thousands of autonomous drones
  • Build and evolve our Databricks and Palantir Foundry environments to empower every Skydian to query data, define jobs, and build dashboards
  • Develop data systems that make our products truly data-driven — from predictive analytics that anticipate hardware failures, to 3D connectivity mapping, to in-depth flight telemetry analysis
  • Create and integrate AI-powered tools for data analysis, transformation, and pipeline generation
  • Champion a data-driven culture by defining and enforcing best practices for data quality, lineage, and governance
  • Collaborate with autonomy, manufacturing, and operations teams to unify how data flows across the company
  • Lead and mentor data engineers, analysts, and stakeholders across Skydio
  • Ensure platform reliability by implementing robust monitoring, observability, and contributing to the on-call rotation for critical data systems
What we offer:
  • Equity in the form of stock options
  • Comprehensive benefits packages
  • Relocation assistance may also be provided for eligible roles
  • Paid vacation time
  • Sick leave
  • Holiday pay
  • 401K savings plan
  • Fulltime

Software Engineer, Data Engineering

Join us in building the future of finance. Our mission is to democratize finance...
Location:
Canada, Toronto
Salary:
124000.00 - 145000.00 CAD / Year
Robinhood
Expiration Date
Until further notice
Requirements:
  • 3+ years of professional experience building end-to-end data pipelines
  • Hands-on software engineering experience, with the ability to write production-level code in Python for user-facing applications, services, or systems (not just data scripting or automation)
  • Expert at building and maintaining large-scale data pipelines using open source frameworks (Spark, Flink, etc)
  • Strong SQL (Presto, Spark SQL, etc) skills
  • Experience solving problems across the data stack (Data Infrastructure, Analytics and Visualization platforms)
  • Expert collaborator with the ability to democratize data through actionable insights and solutions
Job Responsibility:
  • Help define and build key datasets across all Robinhood product areas. Lead the evolution of these datasets as use cases grow
  • Build scalable data pipelines using Python, Spark and Airflow to move data from different applications into our data lake
  • Partner with upstream engineering teams to enhance data generation patterns
  • Partner with data consumers across Robinhood to understand consumption patterns and design intuitive data models
  • Ideate and contribute to shared data engineering tooling and standards
  • Define and promote data engineering best practices across the company
What we offer:
  • bonus opportunities
  • equity
  • benefits
  • Fulltime

Senior Software Engineer, Data Engineering

Join us in building the future of finance. Our mission is to democratize finance...
Location:
United States, Menlo Park
Salary:
146000.00 - 198000.00 USD / Year
Robinhood
Expiration Date
Until further notice
Requirements:
  • 5+ years of professional experience building end-to-end data pipelines
  • Hands-on software engineering experience, with the ability to write production-level code in Python for user-facing applications, services, or systems (not just data scripting or automation)
  • Expert at building and maintaining large-scale data pipelines using open source frameworks (Spark, Flink, etc)
  • Strong SQL (Presto, Spark SQL, etc) skills
  • Experience solving problems across the data stack (Data Infrastructure, Analytics and Visualization platforms)
  • Expert collaborator with the ability to democratize data through actionable insights and solutions
Job Responsibility:
  • Help define and build key datasets across all Robinhood product areas. Lead the evolution of these datasets as use cases grow
  • Build scalable data pipelines using Python, Spark and Airflow to move data from different applications into our data lake
  • Partner with upstream engineering teams to enhance data generation patterns
  • Partner with data consumers across Robinhood to understand consumption patterns and design intuitive data models
  • Ideate and contribute to shared data engineering tooling and standards
  • Define and promote data engineering best practices across the company
What we offer:
  • Market competitive and pay equity-focused compensation structure
  • 100% paid health insurance for employees with 90% coverage for dependents
  • Annual lifestyle wallet for personal wellness, learning and development, and more
  • Lifetime maximum benefit for family forming and fertility benefits
  • Dedicated mental health support for employees and eligible dependents
  • Generous time away including company holidays, paid time off, sick time, parental leave, and more
  • Lively office environment with catered meals, fully stocked kitchens, and geo-specific commuter benefits
  • Bonus opportunities
  • Equity
  • Fulltime

Senior Software Engineer, Core Data

As a Senior Software Engineer on our Core Data team, you will take a leading rol...
Location:
United States
Salary:
190000.00 - 220000.00 USD / Year
Pomelo Care
Expiration Date
Until further notice
Requirements:
  • 5+ years of experience building high-quality, scalable data systems and pipelines
  • Expert-level proficiency in SQL and Python, with a deep understanding of data modeling and transformation best practices
  • Hands-on experience with dbt for data transformation and Dagster, Beam, Dataflow or similar tools for pipeline orchestration
  • Experience with modern data stack tools and cloud platforms, with a strong understanding of data warehouse design principles
  • A track record of delivering elegant and maintainable solutions to complex data problems that drive real business impact
Job Responsibility:
  • Build and maintain elegant data pipelines that orchestrate ingestion from diverse sources and normalize data for company-wide consumption
  • Lead the design and development of robust, scalable data infrastructure that enables our clinical and product teams to make data-driven decisions, using dbt, Dagster, Beam and Dataflow
  • Write clean, performant SQL and Python to transform raw data into actionable insights that power our platform
  • Architect data models and transformations that support both operational analytics and new data-driven product features
  • Mentor other engineers, providing technical guidance on data engineering best practices and thoughtful code reviews, fostering a culture of data excellence
  • Collaborate with product, clinical and analytics teams to understand data needs and ensure we are building infrastructure that unlocks the most impactful insights
  • Optimize data processing workflows for performance, reliability and cost-effectiveness
What we offer:
  • Competitive healthcare benefits
  • Generous equity compensation
  • Unlimited vacation
  • Membership in the First Round Network (a curated and confidential community with events, guides, thousands of Q&A questions, and opportunities for 1-1 mentorship)
  • Fulltime

Senior Software Engineer - ML Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date
Until further notice
Requirements:
  • 5+ years of industry experience as a software engineer, with strong focus on ML/AI infrastructure or large-scale distributed systems
  • Hands-on expertise in building and operating ML platforms (e.g., feature stores, data pipelines, training/inference frameworks)
  • Proven experience delivering reliable and scalable infrastructure in production
  • Solid understanding of ML Ops concepts and tooling, as well as best practices for observability, security, and reliability
  • Strong communication skills and ability to collaborate across teams
Job Responsibility:
  • Design and implement large-scale ML infrastructure, including feature stores, pipelines, deployment tooling, and inference systems
  • Drive the rollout of Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Help define and evangelize an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines and services, including reliability, scalability, performance, and cost efficiency
  • Collaborate with ML product teams to understand requirements and deliver solutions that accelerate experimentation and iteration
  • Contribute to technical strategy and architecture discussions within the team
  • Mentor and support other engineers through code reviews, design discussions, and technical guidance
What we offer:
  • medical, dental, vision, and 401(k)
  • Fulltime

Senior Software Engineer - Data Infrastructure

We build the data and machine learning infrastructure to enable Plaid engineers ...
Location:
United States, San Francisco
Salary:
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date
Until further notice
Requirements:
  • 5+ years of software engineering experience
  • Extensive hands-on software engineering experience, with a strong track record of delivering successful projects within the Data Infrastructure or Platform domain at similar or larger companies
  • Deep understanding of one of: ML Infrastructure systems, including Feature Stores, Training Infrastructure, Serving Infrastructure, and Model Monitoring OR Data Infrastructure systems, including Data Warehouses, Data Lakehouses, Apache Spark, Streaming Infrastructure, Workflow Orchestration
  • Strong cross-functional collaboration, communication, and project management skills, with proven ability to coordinate effectively
  • Proficiency in coding, testing, and system design, ensuring reliable and scalable solutions
  • Demonstrated leadership abilities, including experience mentoring and guiding junior engineers
Job Responsibility:
  • Contribute towards the long-term technical roadmap for data-driven and machine learning iteration at Plaid
  • Leading key data infrastructure projects such as improving ML development golden paths, implementing offline streaming solutions for data freshness, building net new ETL pipeline infrastructure, and evolving data warehouse or data lakehouse capabilities
  • Working with stakeholders in other teams and functions to define technical roadmaps for key backend systems and abstractions across Plaid
  • Debugging, troubleshooting, and reducing operational burden for our Data Platform
  • Growing the team via mentorship and leadership, reviewing technical documents and code changes
What we offer:
  • medical, dental, vision, and 401(k)
  • equity and/or commission
  • Fulltime