Software Engineer, Data Infrastructure

OpenAI

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

185000.00 - 385000.00 USD / Year

Job Description:

Data Platform at OpenAI owns the foundational data stack powering critical product, research, and analytics workflows. We operate some of the largest Spark compute fleets in production; design and build data lakes and metadata systems on Iceberg and Delta with a vision toward exabyte-scale architecture; run high-throughput streaming platforms on Kafka and Flink; provide orchestration with Airflow; and support ML feature engineering tooling such as Chronon. Our mission is to deliver reliable, secure, and efficient data access at scale and accelerate intelligent, AI-assisted data workflows. Join us to build and operate these core platforms that underpin OpenAI products, research, and analytics. We’re not just scaling infrastructure – we’re redefining how people interact with data. Our vision includes intelligent interfaces and AI-assisted workflows that make working with data faster, more reliable, and more intuitive.

Job Responsibility:

  • Design, build, and maintain data infrastructure systems such as distributed compute, data orchestration, distributed storage, streaming infrastructure, and machine learning infrastructure, while ensuring scalability, reliability, and security
  • Ensure our data platform can scale by orders of magnitude while remaining reliable and efficient
  • Accelerate company productivity by empowering your fellow engineers & teammates with excellent data tooling and systems
  • Collaborate with product, research, and analytics teams to build the foundational technical capabilities that unlock new features and experiences
  • Own the reliability of the systems you build, including participation in an on-call rotation for critical incidents
  • Take full lifecycle ownership: architecture, implementation, production operations, and on-call participation
  • Scale and harden big data compute and storage platforms
  • Build and support high-throughput streaming systems
  • Build and operate low-latency data ingestion
  • Enable secure and governed data access for ML and analytics
  • Design for reliability and performance at extreme scale

Requirements:

  • 4+ years in data infrastructure engineering OR 4+ years in infrastructure engineering with a strong interest in data
  • Take pride in building and operating scalable, reliable, secure systems
  • Comfortable with ambiguity and rapid change
  • Intrinsic desire to learn and fill in missing skills
  • Strong talent for sharing learnings clearly and concisely with others
  • Supported Spark, Kafka, Flink, Airflow, Trino, or Iceberg as platforms
  • Well-versed in infrastructure tooling like Terraform
  • Experienced in debugging large-scale distributed systems
  • Excited about solving data infrastructure problems in the AI space

What we offer:

  • Equity
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Relocation support for eligible employees
  • Additional taxable fringe benefits, such as charitable donation matching and wellness stipends, may also be provided
  • Performance-related bonus(es) for eligible employees

Additional Information:

Job Posted:
February 21, 2026

Employment Type:
Fulltime
Work Type:
Hybrid work

Similar Jobs for Software Engineer, Data Infrastructure

Software Engineer, Data Infrastructure

The Data Infrastructure team at Figma builds and operates the foundational platf...
Location:
United States, San Francisco; New York
Salary:
149000.00 - 350000.00 USD / Year
Figma
Expiration Date:
Until further notice
Requirements:
  • 5+ years of Software Engineering experience, specifically in backend or infrastructure engineering
  • Experience designing and building distributed data infrastructure at scale
  • Strong expertise in batch and streaming data processing technologies such as Spark, Flink, Kafka, or Airflow/Dagster
  • A proven track record of impact-driven problem-solving in a fast-paced environment
  • A strong sense of engineering excellence, with a focus on high-quality, reliable, and performant systems
  • Excellent technical communication skills, with experience working across both technical and non-technical counterparts
  • Experience mentoring and supporting engineers, fostering a culture of learning and technical excellence
Job Responsibility:
  • Design and build large-scale distributed data systems that power analytics, AI/ML, and business intelligence
  • Develop batch and streaming solutions to ensure data is reliable, efficient, and scalable across the company
  • Manage data ingestion, movement, and processing through core platforms like Snowflake, our ML Datalake, and real-time streaming systems
  • Improve data reliability, consistency, and performance, ensuring high-quality data for engineering, research, and business stakeholders
  • Collaborate with AI researchers, data scientists, product engineers, and business teams to understand data needs and build scalable solutions
  • Drive technical decisions and best practices for data ingestion, orchestration, processing, and storage
What we offer:
  • equity
  • health, dental & vision
  • retirement with company contribution
  • parental leave & reproductive or family planning support
  • mental health & wellness benefits
  • generous PTO
  • company recharge days
  • a learning & development stipend
  • a work from home stipend
  • cell phone reimbursement
  • Fulltime

Data Infrastructure Engineer

This young, early-stage start-up challenger is currently looking for a hands-on...
Location:
United States, New York or DC
Salary:
Not provided
Orbis Consultants
Expiration Date:
Until further notice
Requirements:
  • Startup Energy: You thrive in fast-paced environments, manage ambiguity well, and focus on what moves the needle
  • Designing and deploying intuitive, user-friendly APIs
  • Demonstrated ability to train and deploy models at scale
  • Successfully launching machine learning services, particularly those leveraging LLMs, embeddings, and inference, into production environments
  • Handling and securing large-scale production data
  • Demonstrated proficiency in Python, Go, or C
  • A proactive approach to tackling complex challenges in a fast-paced, early-stage environment
  • A passion for innovation and a collaborative spirit
Job Responsibility:
  • Developing secure data sharing middleware
  • Integrating software seamlessly into the workflows of specialized professionals, ensuring secure and efficient data access throughout the asset recruitment process
  • Building, shipping, and supporting mission-critical services in support of the services that make up the Data platform
  • Providing solutions for the full data stack – from the data management, software development and model and deployment lifecycles
What we offer:
  • Competitive Salary + Equity
  • Fulltime

Data Infrastructure Engineer

Data Infrastructure Engineer – New York or DC (hybrid) – Competitive Salary + Eq...
Location:
United States, New York or DC
Salary:
Not provided
Orbis Consultants
Expiration Date:
Until further notice
Requirements:
  • Startup Energy: You thrive in fast-paced environments, manage ambiguity well, and focus on what moves the needle
  • Designing and deploying intuitive, user-friendly APIs
  • Demonstrated ability to train and deploy models at scale
  • Successfully launching machine learning services, particularly those leveraging LLMs, embeddings, and inference, into production environments
  • Handling and securing large-scale production data
  • Demonstrated proficiency in Python, Go, or C
  • A proactive approach to tackling complex challenges in a fast-paced, early-stage environment
  • A passion for innovation and a collaborative spirit
Job Responsibility:
  • Joining as part of the founding Engineering team, you will be a key part of developing secure data sharing middleware
  • Their software will integrate seamlessly into the workflows of specialized professionals, ensuring secure and efficient data access throughout the asset recruitment process
  • The data infrastructure engineer role requires a mix of software development and ML Ops practices, resulting in an exciting, fast-paced engineering role
  • You will be able to demonstrate experience building, shipping, and supporting mission-critical services in support of the services that make up the Data platform
  • This role requires the ability to provide solutions for the full data stack – from the data management, software development and model and deployment lifecycles
What we offer:
  • Competitive Salary + Equity
  • Fulltime

Data Infrastructure Engineer

This young, early-stage start-up challenger is currently looking for a hands-on...
Location:
United States, New York or DC
Salary:
Not provided
Orbis Consultants
Expiration Date:
Until further notice
Requirements:
  • Startup Energy: You thrive in fast-paced environments, manage ambiguity well, and focus on what moves the needle
  • Designing and deploying intuitive, user-friendly APIs
  • Demonstrated ability to train and deploy models at scale
  • Successfully launching machine learning services, particularly those leveraging LLMs, embeddings, and inference, into production environments
  • Handling and securing large-scale production data
  • Demonstrated proficiency in Python, Go, or C
  • A proactive approach to tackling complex challenges in a fast-paced, early-stage environment
  • A passion for innovation and a collaborative spirit
Job Responsibility:
  • Developing secure data sharing middleware
  • Integrating software seamlessly into the workflows of specialized professionals, ensuring secure and efficient data access throughout the asset recruitment process
  • Providing solutions for the full data stack – from the data management, software development and model and deployment lifecycles
What we offer:
  • Equity
  • Fulltime

Data Infrastructure Engineer

This young, early-stage start-up challenger is currently looking for a hands-on ...
Location:
United States, New York or DC
Salary:
Not provided
Orbis Consultants
Expiration Date:
Until further notice
Requirements:
  • Startup Energy: You thrive in fast-paced environments, manage ambiguity well, and focus on what moves the needle
  • Designing and deploying intuitive, user-friendly APIs
  • Demonstrated ability to train and deploy models at scale
  • Successfully launching machine learning services, particularly those leveraging LLMs, embeddings, and inference, into production environments
  • Handling and securing large-scale production data
  • Demonstrated proficiency in Python, Go, or C
  • A proactive approach to tackling complex challenges in a fast-paced, early-stage environment
  • A passion for innovation and a collaborative spirit
Job Responsibility:
  • Developing secure data sharing middleware
  • Integrating software seamlessly into the workflows of specialised professionals, ensuring secure and efficient data access throughout the asset recruitment process
  • Providing solutions for the full data stack – from the data management, software development and model and deployment lifecycles
What we offer:
  • Equity
  • Opportunity to work with an Ambitious, Rapidly-Growing Start-Up
  • Fulltime

Senior Software Engineer - Data Infrastructure

We build the data and machine learning infrastructure to enable Plaid engineers ...
Location:
United States, San Francisco
Salary:
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date:
Until further notice
Requirements:
  • 5+ years of software engineering experience
  • Extensive hands-on software engineering experience, with a strong track record of delivering successful projects within the Data Infrastructure or Platform domain at similar or larger companies
  • Deep understanding of one of: ML Infrastructure systems, including Feature Stores, Training Infrastructure, Serving Infrastructure, and Model Monitoring OR Data Infrastructure systems, including Data Warehouses, Data Lakehouses, Apache Spark, Streaming Infrastructure, Workflow Orchestration
  • Strong cross-functional collaboration, communication, and project management skills, with proven ability to coordinate effectively
  • Proficiency in coding, testing, and system design, ensuring reliable and scalable solutions
  • Demonstrated leadership abilities, including experience mentoring and guiding junior engineers
Job Responsibility:
  • Contribute towards the long-term technical roadmap for data-driven and machine learning iteration at Plaid
  • Leading key data infrastructure projects such as improving ML development golden paths, implementing offline streaming solutions for data freshness, building net new ETL pipeline infrastructure, and evolving data warehouse or data lakehouse capabilities
  • Working with stakeholders in other teams and functions to define technical roadmaps for key backend systems and abstractions across Plaid
  • Debugging, troubleshooting, and reducing operational burden for our Data Platform
  • Growing the team via mentorship and leadership, reviewing technical documents and code changes
What we offer:
  • medical, dental, vision, and 401(k)
  • equity and/or commission
  • Fulltime

Senior Software Engineer - ML Infrastructure

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date:
Until further notice
Requirements:
  • 5+ years of industry experience as a software engineer, with strong focus on ML/AI infrastructure or large-scale distributed systems
  • Hands-on expertise in building and operating ML platforms (e.g., feature stores, data pipelines, training/inference frameworks)
  • Proven experience delivering reliable and scalable infrastructure in production
  • Solid understanding of ML Ops concepts and tooling, as well as best practices for observability, security, and reliability
  • Strong communication skills and ability to collaborate across teams
Job Responsibility:
  • Design and implement large-scale ML infrastructure, including feature stores, pipelines, deployment tooling, and inference systems
  • Drive the rollout of Plaid’s next-generation feature store to improve reliability and velocity of model development
  • Help define and evangelize an ML Ops “golden path” for secure, scalable model training, deployment, and monitoring
  • Ensure operational excellence of ML pipelines and services, including reliability, scalability, performance, and cost efficiency
  • Collaborate with ML product teams to understand requirements and deliver solutions that accelerate experimentation and iteration
  • Contribute to technical strategy and architecture discussions within the team
  • Mentor and support other engineers through code reviews, design discussions, and technical guidance
What we offer:
  • medical, dental, vision, and 401(k)
  • Fulltime

Senior Software Engineer, Core Data

As a Senior Software Engineer on our Core Data team, you will take a leading rol...
Location:
United States
Salary:
190000.00 - 220000.00 USD / Year
Pomelo Care
Expiration Date:
Until further notice
Requirements:
  • 5+ years of experience building high-quality, scalable data systems and pipelines
  • Expert-level proficiency in SQL and Python, with a deep understanding of data modeling and transformation best practices
  • Hands-on experience with dbt for data transformation and Dagster, Beam, Dataflow or similar tools for pipeline orchestration
  • Experience with modern data stack tools and cloud platforms, with a strong understanding of data warehouse design principles
  • A track record of delivering elegant and maintainable solutions to complex data problems that drive real business impact
Job Responsibility:
  • Build and maintain elegant data pipelines that orchestrate ingestion from diverse sources and normalize data for company-wide consumption
  • Lead the design and development of robust, scalable data infrastructure that enables our clinical and product teams to make data-driven decisions, using dbt, Dagster, Beam and Dataflow
  • Write clean, performant SQL and Python to transform raw data into actionable insights that power our platform
  • Architect data models and transformations that support both operational analytics and new data-driven product features
  • Mentor other engineers, providing technical guidance on data engineering best practices and thoughtful code reviews, fostering a culture of data excellence
  • Collaborate with product, clinical and analytics teams to understand data needs and ensure we are building infrastructure that unlocks the most impactful insights
  • Optimize data processing workflows for performance, reliability and cost-effectiveness
What we offer:
  • Competitive healthcare benefits
  • Generous equity compensation
  • Unlimited vacation
  • Membership in the First Round Network (a curated and confidential community with events, guides, thousands of Q&A questions, and opportunities for 1-1 mentorship)
  • Fulltime