
Software Engineer, Big Data Infrastructure


Benchling


Location:
United States, San Francisco


Contract Type:
Not provided


Salary:

141104.00 - 190906.00 USD / Year

Job Description:

Benchling's mission is to unlock the power of biotechnology. The world's most innovative biotech companies use Benchling's R&D Cloud to power the development of breakthrough products and accelerate time to milestone and market. Benchling's customers generate a rich variety of scientific data. To keep up this pace of innovation, Benchling needs a highly scalable and extensible data platform that can serve both its customers and its internal application teams.

As one of Benchling's Data Platform engineers, you'll join a rapidly growing, premier engineering team and form the foundation of our data pillar, encompassing customer-facing data products, internal analytics, and the customer-facing data warehouse. The Big Data Infrastructure team is responsible for enabling customers to access their Benchling data for analytics and AI. You will build the next generation of Data Platform services that enable ingress and egress data access so that Benchling can seamlessly integrate with customer data lakes.

Benchling is growing quickly, and you'll be setting the bar for high-quality data and a metrics-driven culture as we scale. You'll serve as a key contributor and thought leader, working closely with product teams to deliver data-driven capabilities to our internal and external customers.

Job Responsibility:

  • Build next generation Data Platform with scalable data ingress/egress for internal and external customers
  • Define and design data transformations and pipelines for cross-functional datasets, ensuring that data integrity and data privacy are treated as first-class concerns, addressed proactively rather than reactively
  • Define the right Service Level Objectives for the batch & streaming pipelines, and optimize their performance
  • Design and create CI/CD pipelines for platform provisioning and full lifecycle management; build the platform control plane to operate the fleet of systems efficiently
  • Work closely with the team across Application and Platform to establish best practices around usage of our data platform

Requirements:

  • Have 2+ years of experience or a proven track record in software engineering
  • Strong experience in backend engineering and distributed systems
  • Strong experience with a scripting language (such as Python)
  • Experience with deployment and configuration management frameworks such as Terraform, Ansible, or Chef and container management systems such as Kubernetes or Amazon ECS
  • Driven by creating positive impact for our customers and Benchling's business, and ultimately accelerating the pace of research in the Life Sciences
  • Comfortable with complexity in the short term but can build towards simplicity in the long term
  • Strong communicator with both words and data - you understand what it takes to go from raw data to something a human understands
  • Willing to work onsite in our SF office 3 days a week

Nice to have:

Experience with data analytics and warehouse solutions (e.g. Snowflake, Delta Lake), data processing technologies (e.g. Kafka, Spark), schema design, and SQL is a plus!

What we offer:
  • Competitive total rewards package
  • Broad range of medical, dental, and vision plans for employees and their dependents
  • Fertility healthcare and family-forming benefits
  • Four months of fully paid parental leave
  • 401(k) + Employer Match
  • Commuter benefits for in-office employees and a generous home office set up stipend for remote employees
  • Mental health benefits, including therapy and coaching, for employees and their dependents
  • Monthly Wellness stipend
  • Learning and development stipend
  • Generous and flexible vacation
  • Company-wide Winter holiday shutdown
  • Sabbaticals for 5-year and 10-year anniversaries

Additional Information:

Job Posted:
February 21, 2026

Employment Type:
Fulltime
Work Type:
Hybrid work
Similar Jobs for Software Engineer, Big Data Infrastructure

Senior Principal Data Platform Software Engineer

We’re looking for a Sr Principal Data Platform Software Engineer (P70) to be a k...
Location:
Salary:
239400.00 - 312550.00 USD / Year
Atlassian
Expiration Date
Until further notice
Requirements:
  • 15+ years in Data Engineering, Software Engineering, or related roles, with substantial exposure to big data ecosystems
  • Demonstrated experience building and operating data platforms or large‑scale data services in production
  • Proven track record of building services from the ground up (requirements → design → implementation → deployment → ongoing ownership)
  • Hands‑on experience with AWS, GCP (e.g., compute, storage, data, and streaming services) and cloud‑native architectures
  • Practical experience with big data technologies, such as Databricks, Apache Spark, AWS EMR, Apache Flink, or StarRocks
  • Strong programming skills in one or more of: Kotlin, Scala, Java, Python
  • Experience leading cross‑team technical initiatives and influencing senior stakeholders
  • Experience mentoring Staff/Principal engineers and lifting the technical bar for a team or org
  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience
Job Responsibility:
  • Design, develop and own delivery of high quality big data and analytical platform solutions aiming to solve Atlassian’s needs to support millions of users with optimal cost, minimal latency and maximum reliability
  • Improve and operate large‑scale distributed data systems in the cloud (primarily AWS, with increasing integration with GCP and Kubernetes‑based microservices)
  • Drive the evolution of our high-performance analytical databases and its integrations with products, cloud infrastructures (AWS and GCP) and isolated cloud environments
  • Help define and uplift engineering and operational standards for petabyte scale data platforms, with sub‑second analytic queries and multi‑region availability (coding guidelines, code review practices, observability, incident response, SLIs/SLOs)
  • Partner across multiple product and platform teams (including Analytics, Marketplace/Ecosystem, Core Data Platform, ML Platform, Search, and Oasis/FedRAMP) to deliver company‑wide initiatives that depend on reliable, high‑quality data
  • Act as a technical mentor and multiplier, raising the bar on design quality, code quality, and operational excellence across the broader team
  • Design and implement self‑healing, resilient data platforms with strong observability, fault tolerance, and recovery characteristics
  • Own the long‑term architecture and technical direction of Atlassian’s product data platform with projects that are directly tied to Atlassian’s company-level OKRs
  • Be accountable for the reliability, cost efficiency, and strategic direction of Atlassian’s product analytical data platform
  • Partner with executives and influence senior leaders to align engineering efforts with Atlassian’s long-term business objectives
What we offer:
  • health and wellbeing resources
  • paid volunteer days
  • Fulltime

Data Engineering Lead

The Data Engineering Lead is a strategic professional who stays abreast of developments...
Location:
India, Pune
Salary:
Not provided
Citi
Expiration Date
Until further notice
Requirements:
  • 10-15 years of hands-on experience in Hadoop, Scala, Java, Spark, Hive, Kafka, Impala, Unix Scripting and other Big data frameworks
  • 4+ years of experience with relational SQL and NoSQL databases: Oracle, MongoDB, HBase
  • Strong proficiency in Python and Spark Java with knowledge of core spark concepts (RDDs, Dataframes, Spark Streaming, etc) and Scala and SQL
  • Data Integration, Migration & Large Scale ETL experience (Common ETL platforms such as PySpark/DataStage/AbInitio etc.) - ETL design & build, handling, reconciliation and normalization
  • Data Modeling experience (OLAP, OLTP, Logical/Physical Modeling, Normalization, knowledge on performance tuning)
  • Experienced in working with large and multiple datasets and data warehouses
  • Experience building and optimizing ‘big data’ data pipelines, architectures, and datasets
  • Strong analytic skills and experience working with unstructured datasets
  • Ability to effectively use complex analytical, interpretive, and problem-solving techniques
  • Experience with Confluent Kafka, Redhat JBPM, CI/CD build pipelines and toolchain – Git, BitBucket, Jira
Job Responsibility:
  • Strategic Leadership: Define and execute the data engineering roadmap for Global Wealth Data, aligning with overall business objectives and technology strategy
  • Team Management: Lead, mentor, and develop a high-performing, globally distributed team of data engineers, fostering a culture of collaboration, innovation, and continuous improvement
  • Architecture and Design: Oversee the design and implementation of robust and scalable data pipelines, data warehouses, and data lakes, ensuring data quality, integrity, and availability for global wealth data
  • Technology Selection and Implementation: Evaluate and select appropriate technologies and tools for data engineering, staying abreast of industry best practices and emerging trends specific to wealth management data
  • Performance Optimization: Continuously monitor and optimize data pipelines and infrastructure for performance, scalability, and cost-effectiveness, ensuring optimal access to global wealth data
  • Collaboration: Partner with business stakeholders, data scientists, portfolio managers, and other technology teams to understand data needs and deliver effective solutions that support investment strategies and client reporting
  • Data Governance: Implement and enforce data governance policies and procedures to ensure data quality, security, and compliance with relevant regulations, particularly around sensitive financial data
  • Fulltime

Data Engineering Lead

The Engineering Lead Analyst is a senior level position responsible for leading ...
Location:
Singapore, Singapore
Salary:
Not provided
Citi
Expiration Date
Until further notice
Requirements:
  • 10-15 years of hands-on experience in Hadoop, Scala, Java, Spark, Hive, Kafka, Impala, Unix Scripting and other Big data frameworks
  • 4+ years of experience with relational SQL and NoSQL databases: Oracle, MongoDB, HBase
  • Strong proficiency in Python and Spark Java with knowledge of core spark concepts (RDDs, Dataframes, Spark Streaming, etc) and Scala and SQL
  • Data Integration, Migration & Large Scale ETL experience (Common ETL platforms such as PySpark/DataStage/AbInitio etc.) - ETL design & build, handling, reconciliation and normalization
  • Data Modeling experience (OLAP, OLTP, Logical/Physical Modeling, Normalization, knowledge on performance tuning)
  • Experienced in working with large and multiple datasets and data warehouses
  • Experience building and optimizing ‘big data’ data pipelines, architectures, and datasets
  • Strong analytic skills and experience working with unstructured datasets
  • Ability to effectively use complex analytical, interpretive, and problem-solving techniques
  • Experience with Confluent Kafka, Redhat JBPM, CI/CD build pipelines and toolchain – Git, BitBucket, Jira
Job Responsibility:
  • Define and execute the data engineering roadmap for Global Wealth Data, aligning with overall business objectives and technology strategy
  • Lead, mentor, and develop a high-performing, globally distributed team of data engineers, fostering a culture of collaboration, innovation, and continuous improvement
  • Oversee the design and implementation of robust and scalable data pipelines, data warehouses, and data lakes, ensuring data quality, integrity, and availability for global wealth data
  • Evaluate and select appropriate technologies and tools for data engineering, staying abreast of industry best practices and emerging trends specific to wealth management data
  • Continuously monitor and optimize data pipelines and infrastructure for performance, scalability, and cost-effectiveness
  • Partner with business stakeholders, data scientists, portfolio managers, and other technology teams to understand data needs and deliver effective solutions
  • Implement and enforce data governance policies and procedures to ensure data quality, security, and compliance with relevant regulations
What we offer:
  • Equal opportunity employer commitment
  • Accessibility and accommodation support
  • Global workforce benefits
  • Fulltime

Data Engineer

As a Data Engineer you will work with product owners, data scientists, business ...
Location:
United States, Woodcliff Lake, New Jersey
Salary:
Not provided
Techstar Consulting
Expiration Date
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Mathematics, or Engineering or the equivalent of 4 years of related professional IT experience
  • 3+ years of enterprise software engineering experience with object-oriented design, coding and testing patterns, as well as experience in engineering (commercial or open source) software platforms and large-scale data infrastructure solutions
  • 3+ years of software engineering and architecture experience within a cloud environment (Azure, AWS)
  • 3+ years of enterprise data engineering experience within any “Big Data” environment (preferred)
  • 3+ years of software development experience using Python
  • 2+ years of experience working in an Agile environment (Scrum, Lean or Kanban)
  • 3+ years of experience working in large-scale data integration and analytics projects, including using cloud (e.g., AWS Redshift, S3, EC2, Glue, Kinesis, EMR) and data-orchestration (e.g., Oozie, Apache Airflow) technologies
  • 3+ years of experience in implementing distributed data processing pipelines using Apache Spark
  • 3+ years of experience in designing relational/NoSQL databases and data warehouse solutions
  • 2+ years of experience in writing and optimizing SQL queries in a business environment with large-scale, complex datasets
Job Responsibility:
  • Implements and enhances complex data processing pipelines with a focus on collecting, parsing, cleaning, managing, and analyzing large data sets that produce valuable business insights and discoveries
  • Determines the required infrastructure, services, and software required to build advanced data ingestion & transformation pipelines and solutions in the cloud
  • Assists data scientists and data analysts with data preparation, exploration, and analysis activities
  • Applies problem solving experience and knowledge of advanced algorithms to build high performance, parallel, and distributed solutions
  • Performs code and solution review activities and recommends enhancements that improve efficiency, performance, stability, and decreased support costs
  • Applies the latest DevOps and Agile methodologies to improve delivery time
  • Works with SCRUM teams in daily stand-up, providing progress updates on a frequent basis
  • Supports application, including incident and problem management
  • Performs debugging and triage of incident or problem and deployment of fix to restore services
  • Documents requirements and configurations and clarifies ambiguous specs

Senior Software Engineer

This role is hybrid work. Come work on fantastically high-scale systems with us!...
Location:
India, Mumbai
Salary:
Not provided
Blis
Expiration Date
Until further notice
Requirements:
  • C++, including Boost
  • Networking topics from asynchronous connection handling to TCP/IP parameters
  • Concurrency
  • RESTful APIs and web-serving concepts
  • Big Data structures and high-frequency data processing algorithms at scale
  • Relational and non-relational databases and concepts
  • Server-side Linux use and administration
  • Cloud infrastructure concepts and utilisation
  • Engineering design principles and when to go fast and when to go slow
  • 5+ years experience as a systems engineer or architect for complex, high-performance systems
Job Responsibility:
  • Innovate, implement, support, and iterate on our real-time application systems, infrastructure, and code
  • Write and improve high-performance, highly efficient, and highly maintainable C++
  • Ensure our designs and systems are highly available, resilient, and secure
  • Support and mentor other members of the team
  • Commitment to Blis' Inclusion initiatives & 5 step sustainability plan
What we offer:
  • Comprehensive private healthcare
  • Matched pension scheme
  • Paid time off and one extra day off for your birthday
  • Enhanced paternity and maternity leave
  • Career coaching and development paths
  • Hybrid working
  • Fulltime

Data Engineer

We are seeking our first Data Engineer, someone who can refine our data infrastr...
Location:
United States, New York City; San Francisco
Salary:
190000.00 - 250000.00 USD / Year
Hebbia
Expiration Date
Until further notice
Requirements:
  • Bachelor's or Master's degree in Computer Science, Data Science, Statistics, or a related field
  • 5+ years software development experience at a venture-backed startup or top technology firm, with a focus on data engineering
  • Significant hands-on experience in data engineering (ETL development, data warehousing, data lake management, etc.)
  • Adept at identifying and owning data projects end to end, with the ability to work independently and exercise sound judgment
  • Proficient in Python and SQL; comfortable working with cloud-based data stack tools
  • Familiar with big data processing frameworks (e.g., Spark, Hadoop) and data integration technologies (e.g., Airflow, DBT, or similar)
  • Experience implementing data governance, security, and compliance measures
  • Strong collaboration and communication skills, with the ability to translate business requirements into technical solutions
  • You are comfortable working in-person 5 days a week
Job Responsibility:
  • Architect, build, and maintain ETL pipelines and workflows that ensure high data quality and reliability
  • Design and manage a central data lake to consolidate data from various sources, enabling advanced analytics and reporting
  • Collaborate with cross-functional stakeholders (product, engineering, and business) to identify data gaps and develop effective solutions
  • Implement best practices in data security and governance to ensure compliance and trustworthiness
  • Evaluate and integrate new technologies, tools, and approaches to optimize data processes and architectures
  • Continuously monitor, troubleshoot, and improve data pipelines and infrastructure for performance, scalability, and cost-efficiency
What we offer:
  • PTO: Unlimited
  • Insurance: Medical + Dental + Vision + 401K + Wellness Benefits
  • Eats: Catered lunch daily + doordash dinner credit if you ever need to stay late
  • Parental leave policy: 3 months non-birthing parent, 4 months for birthing parent
  • Fertility benefits: $15k lifetime benefit
  • New hire equity grant: competitive equity package with unmatched upside potential
  • Fulltime

Software Engineer Consultant

HIKE2 has an exciting opportunity for a Software Engineer Consultant. We are see...
Location:
United States, Chicago
Salary:
75000.00 - 140000.00 USD / Year
Hike2
Expiration Date
Until further notice
Requirements:
  • Bachelor's degree
  • 1-4 years of experience building and deploying software solutions
  • Relevant experience in Software Engineering, Data Science, or Data Engineering
  • Experience with SCRUM and SDLC
  • Familiarity with major cloud platforms and architectures, such as AWS/Azure
  • Strong experience working with APIs
  • 2-3 years of experience in Python and cloud infrastructure
  • 2-3 years of experience setting up or optimizing compute resources
  • Light data modeling, statistical modeling, or data monitoring experience
  • Proficiency in various AI components and solutions
Job Responsibility:
  • Develop and build software that integrates to a variety of technologies to automate decisions and actions
  • Develop and integrate with APIs
  • Ability to shape and influence the technical design and architecture
  • Design, develop, and maintain scalable and efficient solutions that combine custom code with integration
  • Collaborate with cross-functional teams to understand business requirements and implement solutions that meet business needs
  • Optimize and troubleshoot software for performance and reliability
  • Stay updated on industry trends and best practices in software engineering
  • Work closely with design, strategy, data scientists, analysts, and other stakeholders to understand the technical solution and the role of technology
What we offer:
  • Six national health medical plans to choose from, including a HSA option
  • Dental & Vision options
  • Retirement Savings with a Safe Harbor 401K plan with immediate vesting and company match
  • Long and short term disability coverage options
  • Life Insurance and travel insurance
  • Flexible PTO policy and 10 paid holidays
  • Reimbursement for certifications related to your role
  • Opportunity for career development, advancement and learning
  • Fulltime

Data Engineer

Barbaricum is seeking a Data Engineer to provide support to an emerging capability ...
Location:
United States, Omaha
Salary:
Not provided
Barbaricum
Expiration Date
Until further notice
Requirements:
  • Active DoD Top Secret/SCI clearance required
  • 8+ years of demonstrated experience in software engineering
  • Bachelor’s degree in computer science or a related field
  • 8+ years of experience working with AWS big data technologies (S3, EC2) and demonstrate experience in distributed data processing, Data Modeling, ETL Development, and/or Data Warehousing
  • Demonstrated mid-level knowledge of software engineering best practices across the development lifecycle
  • 3+ years of experience using analytical concepts and statistical techniques
  • 8+ years of demonstrated experience across Mathematics, Applied Mathematics, Statistics, Applied Statistics, Machine Learning, Data Science, Operations Research, or Computer Science especially around software engineering and/or designing/implementing machine learning, data mining, advanced analytical algorithms, programming, data science, advanced statistical analysis, artificial intelligence
Job Responsibility:
  • Design, implement, and operate data management systems for intelligence needs
  • Use Python to automate data workflows
  • Design algorithms, databases, and pipelines to access and optimize data retrieval, storage, use, integration, and management across different data regimes and digital systems
  • Work with data users to determine, create, and populate optimal data architectures, structures, and systems, and plan, design, and optimize data throughput and query performance
  • Participate in the selection of backend database technologies (e.g. SQL, NoSQL, etc.), its configuration and utilization, and the optimization of the full data pipeline infrastructure to support the actual content, volume, ETL, and periodicity of data to support the intended kinds of queries and analysis to match expected responsiveness
  • Assist and advise the Government with developing, constructing, and maintaining data architectures
  • Research, study, and present technical information, in the form of briefings or written papers, on relevant data engineering methodologies and technologies of interest to or as requested by the Government
  • Align data architecture, acquisition, and processes with intelligence and analytic requirements
  • Prepare data for predictive and prescriptive modeling deploying analytics programs, machine learning and statistical methods to find hidden patterns, discover tasks and processes which can be automated and make recommendations to streamline data processes and visualizations