Software Engineer, Data Infrastructure

United States, San Francisco · 149000.00 - 350000.00 USD / Year · Job Posted December 08, 2025

Job Description

The Data Infrastructure team at Figma builds and operates the foundational platforms that power analytics, AI, and data-driven decision-making across the company. We serve a diverse set of stakeholders, including AI Researchers, Machine Learning Engineers, Data Scientists, Product Engineers, and business teams that rely on data for insights and strategy. Our team owns and scales critical data platforms such as the Snowflake data warehouse, ML Datalake, and large-scale data movement and processing applications, managing all data flowing into and out of these platforms.

Job Responsibility

  • Design and build large-scale distributed data systems that power analytics, AI/ML, and business intelligence
  • Develop batch and streaming solutions to ensure data is reliable, efficient, and scalable across the company
  • Manage data ingestion, movement, and processing through core platforms like Snowflake, our ML Datalake, and real-time streaming systems
  • Improve data reliability, consistency, and performance, ensuring high-quality data for engineering, research, and business stakeholders
  • Collaborate with AI researchers, data scientists, product engineers, and business teams to understand data needs and build scalable solutions
  • Drive technical decisions and best practices for data ingestion, orchestration, processing, and storage

Requirements

  • 5+ years of Software Engineering experience, specifically in backend or infrastructure engineering
  • Experience designing and building distributed data infrastructure at scale
  • Strong expertise in batch and streaming data processing technologies such as Spark, Flink, Kafka, or Airflow/Dagster
  • A proven track record of impact-driven problem-solving in a fast-paced environment
  • A strong sense of engineering excellence, with a focus on high-quality, reliable, and performant systems
  • Excellent technical communication skills, with experience working across both technical and non-technical counterparts
  • Experience mentoring and supporting engineers, fostering a culture of learning and technical excellence

Nice to have

  • Experience with data governance, access control, and cost optimization strategies for large-scale data platforms
  • Familiarity with our stack, including Golang, Python, SQL, frameworks such as dbt, and technologies like Spark, Kafka, Snowflake, and Dagster
  • Experience designing data infrastructure for AI/ML pipelines
  • The ability to navigate ambiguity, take ownership, and drive projects from inception to execution

What we offer

  • equity
  • health, dental & vision
  • retirement with company contribution
  • parental leave & reproductive or family planning support
  • mental health & wellness benefits
  • generous PTO
  • company recharge days
  • a learning & development stipend
  • a work from home stipend
  • cell phone reimbursement
  • annual bonus plan for eligible non-sales roles

Similar Jobs for Software Engineer, Data Infrastructure

8 matching positions

Software Engineer, Data Infrastructure

Data Platform at OpenAI owns the foundational data stack powering critical produ...
Location
United States, San Francisco
Salary
185000.00 - 385000.00 USD / Year
OpenAI
Expiration Date
Until further notice
Requirements
  • 4+ years in data infrastructure engineering OR 4+ years in infrastructure engineering with a strong interest in data
  • Take pride in building and operating scalable, reliable, secure systems
  • Comfortable with ambiguity and rapid change
  • Intrinsic desire to learn and fill in missing skills
  • Strong talent for sharing learnings clearly and concisely with others
  • Supported Spark, Kafka, Flink, Airflow, Trino, or Iceberg as platforms
  • Well-versed in infrastructure tooling like Terraform
  • Experienced in debugging large-scale distributed systems
  • Excited about solving data infrastructure problems in the AI space
Job Responsibility
  • Design, build, and maintain data infrastructure systems such as distributed compute, data orchestration, distributed storage, streaming infrastructure, and machine learning infrastructure while ensuring scalability, reliability, and security
  • Ensure our data platform can scale by orders of magnitude while remaining reliable and efficient
  • Accelerate company productivity by empowering your fellow engineers & teammates with excellent data tooling and systems
  • Collaborate with product, research and analytics teams to build the technical foundations and capabilities that unlock new features and experiences
  • Own the reliability of the systems you build, including participation in an on-call rotation for critical incidents
  • Take full lifecycle ownership: architecture, implementation, production operations, and on-call participation
  • Scale and harden big data compute and storage platforms
  • Build and support high-throughput streaming systems
  • Build and operate low-latency data ingestion
  • Enable secure and governed data access for ML and analytics
What we offer
  • Equity
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Fulltime

Senior .NET Software Engineer (Data Infrastructure)

At the core of Bentley's global infrastructure solutions lies a critical data se...
Location
Lithuania, Vilnius; Kaunas
Salary
4000.00 EUR / Month
Bentley Systems
Expiration Date
Until further notice
Requirements
  • A Bachelor’s degree in Computer Science, Software Engineering, or a related field
  • At least 5 years of proven experience in software development with C#, .NET Core, and a strong understanding of OOP, data structures, and test frameworks
  • Expert-level, hands-on experience with major object storage platforms (Azure Blob Storage, Google Cloud Storage, AWS S3). This must include deep knowledge of storage usage optimization, lifecycle policies, and designing cost-efficient data access patterns
  • Solid experience with Docker and Kubernetes for deploying and managing containerized applications
  • Proven ability to write clean, maintainable, testable, and secure code, with an intuitive understanding of the long-term impact of architectural decisions
  • A self-motivated and proactive mindset, with the ability to work effectively as an individual contributor and as part of a high-performing team in an Agile/Scrum environment
  • Strong verbal and written communication skills in English
Job Responsibility
  • Architecting for Scale & Stability: Design and develop robust, event-driven cloud services and core components, with a primary focus on stability, performance, and long-term maintainability
  • Modernizing Our Storage Solutions: Implement and optimize solutions using the latest cloud object storage technologies (Google Cloud storage, Azure Blob, AWS S3, etc.) to enhance performance and cost-efficiency
  • Hands-On Implementation: Use the latest .NET development tools to turn complex architectural designs into high-quality, production-ready software
  • Championing DevOps & Automation: Develop and utilize fully automated CI/CD pipelines to deliver both application and infrastructure changes seamlessly and safely into production
  • Driving Technical Excellence: Mentor and share your deep expertise with colleagues, elevating the team's technical capabilities
  • Ensuring System Health: Support the existing code base, troubleshoot complex production issues, and collaborate across teams to ensure end-to-end service reliability
What we offer
  • A great team and culture
  • An exciting career as an integral part of a world-leading software company
  • An attractive salary and benefits package
  • A commitment to inclusion, belonging and colleague wellbeing
  • Training and professional development opportunities (certification programs, conferences, etc.)
  • Additional annual leave days and extra paid days for different occasions (marriage, moving day, bereavement leave, etc.)
  • Health insurance package and 24/7 accident insurance
  • Referral program with bonuses
  • Extra paid day for volunteering in the organization of your choice
  • Ability to work from office or hybrid from home
  • Fulltime

Senior Software Engineer, Data Infrastructure & AI

Fullstory Anywhere is one of Fullstory's three primary product verticals, and it...
Location
United States, Atlanta
Salary
160000.00 - 170000.00 USD / Year
Fullstory
Expiration Date
Until further notice
Requirements
  • Significant experience building and operating high-throughput data pipelines (batch and/or streaming) in a major cloud platform, including work with cloud data warehouses like BigQuery, Snowflake, or Databricks.
  • Proficiency in Go, Python, Java or a similar language.
  • Hands-on experience with data transformation tooling such as dbt, with a strong understanding of data modeling and pipeline observability.
  • Familiarity with LLM integration patterns and evaluation approaches (e.g., LangSmith, Vertex AI, or comparable frameworks), or demonstrated ability to ramp quickly in applied AI.
  • A track record of owning major system areas end-to-end: driving architectural decisions, maintaining production health, and improving reliability over time.
Job Responsibility
  • Maintain, extend, and scale Go microservices that transform and deliver Fullstory session data into customer warehouses and power the team's MCP server that enables AI agent integrations.
  • Develop and maintain dbt models and pipeline orchestration to ensure timely, fault-tolerant data migrations across hundreds of customer destinations.
  • Define evaluation frameworks for LLM outputs using tools like LangSmith and Vertex AI, ensuring AI-powered customer agents produce accurate, useful results.
  • Investigate and resolve production incidents across the data pipeline, implementing systemic fixes that prevent entire classes of failure from recurring.
  • Write technical design documents that drive consensus on architectural changes, proactively surfacing scaling bottlenecks, edge cases, and cross-team dependencies.
  • Demonstrate sound technical judgment by de-risking work through spikes, taking on tech debt deliberately, and knowing when to escalate versus dig in.
What we offer
  • Flexibility and Connection: flexible PTO policy, annual company-wide closure
  • Benefits: paid parental leave; bereavement leave, including miscarriage/pregnancy loss
  • Learning opportunities: annual learning subsidy
  • Productivity support: monthly productivity stipend
  • Fulltime

Software Engineer, Data Infrastructure - Research

The Workload team is responsible for designing and running OpenAI’s LLM training...
Location
United States, San Francisco
Salary
250000.00 - 380000.00 USD / Year
OpenAI
Expiration Date
Until further notice
Requirements
  • Strong engineering fundamentals with experience in distributed systems, data pipelines, or infrastructure
  • Experience building APIs, modular code, and scalable abstractions
  • Comfortable debugging bottlenecks across large fleets of machines
  • Pride in building infrastructure that 'just works'
  • Collaborative, humble, and excited to own a foundational part of the ML stack
Job Responsibility
  • Design and implement the dataset infrastructure that powers OpenAI’s next-generation training stack
  • Design and maintain standardized dataset APIs, including for multimodal (MM) data that cannot fit in memory
  • Build proactive testing and scale validation pipelines for dataset loading at GPU scale
  • Collaborate with teammates to integrate datasets seamlessly into training and inference pipelines
  • Document and maintain dataset interfaces so they are discoverable, consistent, and easy for other teams to adopt
  • Establish safeguards and validation systems to ensure datasets remain reproducible and unchanged once standardized
  • Debug and resolve performance bottlenecks in distributed dataset loading
  • Provide visualization and inspection tools to surface errors, bugs, or bottlenecks in datasets
What we offer
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Fulltime

Staff Software Engineer, Data Infrastructure

At Docker, we make app development easier so developers can focus on what matter...
Location
United States, Seattle
Salary
195400.00 - 275550.00 USD / Year
Docker
Expiration Date
Until further notice
Requirements
  • 8+ years of software engineering experience with 3+ years focused on data engineering and analytics systems
  • Expert-level experience with Snowflake including advanced SQL, performance optimization, and cost management
  • Deep proficiency in DBT for data modeling, transformation, and testing with experience in large-scale implementations
  • Strong expertise with Apache Airflow for complex workflow orchestration and pipeline management
  • Hands-on experience with Sigma or similar modern BI platforms for self-service analytics
  • Extensive AWS experience including data services (S3, Redshift, EMR, Glue, Lambda, Kinesis) and infrastructure management
  • Proficiency in Python, SQL, and other programming languages commonly used in data engineering
  • Experience with infrastructure-as-code, CI/CD practices, and modern DevOps tools
  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience
  • Proven track record designing and implementing large-scale distributed data systems
Job Responsibility
  • Define and drive the technical strategy for Docker's data platform architecture, establishing long-term vision for scalable data systems
  • Lead design and implementation of highly scalable data infrastructure leveraging Snowflake, AWS, Airflow, DBT, and Sigma
  • Architect end-to-end data pipelines supporting real-time and batch analytics across Docker's product ecosystem
  • Drive technical decision-making around data platform technologies, architectural patterns, and engineering best practices
  • Establish technical standards for data quality, testing, monitoring, and operational excellence
  • Design and build robust, scalable data systems that process petabytes of data and support millions of user interactions
  • Implement complex data transformations and modeling using DBT for analytics and business intelligence use cases
  • Develop and maintain sophisticated data orchestration workflows using Apache Airflow
  • Optimize Snowflake performance and cost efficiency while ensuring reliability and scalability
  • Build data APIs and services that enable self-service analytics and integration with downstream systems
What we offer
  • Freedom & flexibility: fit your work around your life
  • Designated quarterly Whaleness Days plus end of year Whaleness break
  • Home office setup: we want you comfortable while you work
  • 16 weeks of paid Parental leave
  • Technology stipend equivalent to $100 net/month
  • PTO plan that encourages you to take time to do the things you enjoy
  • Training stipend for conferences, courses and classes
  • Equity
  • Fulltime

Senior Software Engineer, Data Infrastructure

LMArena is seeking a Software Engineer to join our team and build the data pipel...
Location
United States, Bay Area
Salary
Not provided
Arena Intelligence, Inc.
Expiration Date
Until further notice
Requirements
  • 5+ years of experience in software engineering, with a dedicated focus on data engineering and big data technologies
  • Proficiency in SQL and at least one programming language commonly used for data analysis (Python (preferred), Scala, R)
  • Hands-on experience with data processing and pipeline frameworks (Apache Spark, Ray Data, etc.) and at least one popular big data analytics platform (Databricks, Snowflake)
  • Demonstrated experience in designing, implementing, optimizing, and debugging production data pipelines
Job Responsibility
  • Design and build robust data pipelines to ingest, process, and transform user vote data into features essential for model performance evaluation
  • Collaborate with researchers and product leadership to understand product goals and necessary data
  • Design and implement solutions to generate result dashboards and reports, providing useful information for the public, model providers, and researchers
  • Ensure the integrity, data quality, and reliability of the pipelines
  • Scale our data infrastructure to accommodate increasing data volumes and evolving analytical needs
What we offer
  • Comprehensive health and wellness benefits, including medical, dental, vision, and additional support programs
  • The opportunity to work on cutting-edge AI with a small, mission-driven team
  • A culture that values transparency, trust, and community impact
  • Fulltime

Software Engineer - Data Infrastructure

We’re looking for an experienced software engineer to help shape the foundation ...
Location
United States, New York City
Salary
135000.00 - 280000.00 USD / Year
Assembled
Expiration Date
Until further notice
Requirements
  • Have experience working with modern data warehouses (e.g., Snowflake, BigQuery) and understand their performance characteristics
  • Have built or maintained end-to-end ELT pipelines and are comfortable choosing the right level of precomputation
  • Have designed or worked closely with a metrics or semantic layer, and understand how to define metrics that are consistent, queryable, and performant across reporting surfaces
  • Are comfortable reasoning about systems tradeoffs—latency, cost, developer velocity, and reliability
  • Take pride in building systems that are clear, maintainable, and empower others
  • Have strong SQL fluency and are comfortable reading query plans, debugging slow queries, and optimizing for performance
Job Responsibility
  • Design and build systems that power both the storage and retrieval of analytical data
  • Own the transformation layer that models data for fast, consistent metric queries
  • Define and maintain the metrics layer that supports dashboards, exports, APIs, and internal tools
  • Collaborate with product, infrastructure, and Assist teams to build rich reporting experiences—like helping customers measure ROI on AI adoption
  • Manage scalable pipelines that move and transform production data for analysis
  • Instrument observability into the data platform, including freshness, lineage, and correctness
What we offer
  • Generous medical, dental, and vision benefits
  • Paid company holidays, sick time, and unlimited time off
  • Monthly credits to spend on each: professional development, general wellness, Assembled customers, and commuting
  • Paid parental leave
  • Hybrid work model with catered lunches every day (M-F), snacks, and beverages in our SF & NY offices
  • 401(k) plan enrollment
  • Stock options
  • Fulltime

Senior Software Engineer - Data Infrastructure

We build the data and machine learning infrastructure to enable Plaid engineers ...
Location
United States, San Francisco
Salary
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date
Until further notice
Requirements
  • 5+ years of software engineering experience
  • Extensive hands-on software engineering experience, with a strong track record of delivering successful projects within the Data Infrastructure or Platform domain at similar or larger companies
  • Deep understanding of one of: ML Infrastructure systems, including Feature Stores, Training Infrastructure, Serving Infrastructure, and Model Monitoring OR Data Infrastructure systems, including Data Warehouses, Data Lakehouses, Apache Spark, Streaming Infrastructure, Workflow Orchestration
  • Strong cross-functional collaboration, communication, and project management skills, with proven ability to coordinate effectively
  • Proficiency in coding, testing, and system design, ensuring reliable and scalable solutions
  • Demonstrated leadership abilities, including experience mentoring and guiding junior engineers
Job Responsibility
  • Contributing to the long-term technical roadmap for data-driven and machine learning iteration at Plaid
  • Leading key data infrastructure projects such as improving ML development golden paths, implementing offline streaming solutions for data freshness, building net new ETL pipeline infrastructure, and evolving data warehouse or data lakehouse capabilities
  • Working with stakeholders in other teams and functions to define technical roadmaps for key backend systems and abstractions across Plaid
  • Debugging, troubleshooting, and reducing operational burden for our Data Platform
  • Growing the team via mentorship and leadership, reviewing technical documents and code changes
What we offer
  • medical, dental, vision, and 401(k)
  • equity and/or commission
  • Fulltime