Lead Data Engineer

Life Science Talent

Location:
Denmark, Copenhagen and surrounding area

Contract Type:
Not provided

Salary:

Not provided

Job Description:

We're building the expert intelligence layer for scientific research: a knowledge graph that connects the world to leading experts based on publications & clinical trials in precise ontologies. You'll design pipelines that ingest millions of life-science records, shaping a graph of how scientific knowledge is modelled, enriched, & served. This is true greenfield work. Your decisions will lay the data foundations for our entire expert intelligence platform.

Job Responsibility:

  • Own data end-to-end: design & run data pipelines that turn millions of scientific records into a knowledge graph
  • Implement precision entity resolution & enrichment: disambiguate & enrich experts from noisy data sources
  • Utilise LLM workflows where they make sense: for entity extraction, relationship inference & quality validation
  • Develop vector embeddings & semantic search capabilities to power expert discovery & similarity matching
  • Model life-science entities & relationships: ontologies, author networks, publication & clinical trial metadata
  • Build graph & vector data access that is performant, accessible, reliable, observable & testable
  • Move fast & ship value incrementally: done-and-iterating beats perfect-and-pending
  • Radiate intent & document your thinking openly, collaborating async-first in a hybrid environment
  • Lead when you're the expert, follow when someone else is, challenging assumptions when necessary
  • Use AI as a daily force multiplier across coding, schema design, debugging, optimisation & validation

Requirements:

  • Graph Databases: Neo4j, ArangoDB, Neptune; schema design, relationship modelling, query optimisation
  • Python Data Engineering: ETL development; pandas/polars; distributed processing with Spark or Dask
  • Entity Resolution: Deduplication, merging, enrichment across heterogeneous scientific data sources
  • AI-Assisted Data Extraction: LLM entity extraction, schema generation & quality validation
  • Vector Search: Experience with Pinecone, FAISS, Qdrant, or Weaviate; embeddings, hybrid retrieval
  • Workflow Orchestration: Robust, observable pipelines using Airflow or Dagster
  • Data Formats & Standards: Parquet, JSONL, RDF/Turtle; selecting formats for graph & semantic use cases
  • Embedding Models: Understanding of HuggingFace/OpenAI models, dimensionality tradeoffs & cost
  • Ownership mindset: Treat data & schemas as products powering multiple domains
  • Strategic evaluation: Choose tech aligned with our scale, latency expectations, & roadmap needs
  • Process engineering: Build reliable, repeatable & maintainable workflows
  • Cross-functional communication: Bridge product engineers & scientific domain teams
  • Comfort with scientific data realities: Deep rabbit holes of sprawling complexity

Nice to have:

  • Life Sciences familiarity: Publication, clinical trial & institutional data; ontologies (MeSH, SNOMED, Gene Ontology)
  • Hands-on with scientific datasets: OpenAlex, PubMed/MEDLINE, ORCID, Semantic Scholar, ClinicalTrials.gov

Additional Information:

Job Posted:
January 05, 2026

Employment Type:
Fulltime
Work Type:
Hybrid work

Similar Jobs for Lead Data Engineer

Senior AWS Data Engineer / Data Platform Engineer

We are seeking a highly experienced Senior AWS Data Engineer to design, build, a...
Location:
United Arab Emirates, Dubai
Salary:
Not provided
NorthBay
Expiration Date:
Until further notice
Requirements:
  • 8+ years of experience in data engineering and data platform development
  • Strong hands-on experience with: AWS Glue, Amazon EMR (Spark), AWS Lambda, Apache Airflow (MWAA), Amazon EC2, Amazon CloudWatch, Amazon Redshift, Amazon DynamoDB, AWS DataZone
Job Responsibility:
  • Design, develop, and optimize scalable data pipelines using AWS native services
  • Lead the implementation of batch and near-real-time data processing solutions
  • Architect and manage data ingestion, transformation, and storage layers
  • Build and maintain ETL/ELT workflows using AWS Glue and Apache Spark on EMR
  • Orchestrate complex data workflows using Apache Airflow (MWAA)
  • Develop and manage serverless data processing using AWS Lambda
  • Design and optimize data warehouses using Amazon Redshift
  • Implement and manage NoSQL data models using Amazon DynamoDB
  • Utilize AWS DataZone for data governance, cataloging, and access management
  • Monitor, log, and troubleshoot data pipelines using Amazon CloudWatch
Employment Type:
Fulltime

Lead Data Engineer

Location:
Uzbekistan, Tashkent
Salary:
Not provided
Vention
Expiration Date:
Until further notice
Requirements:
  • 4+ years in data engineering, including leading teams
  • Strong experience with Python and SQL
  • Solid knowledge of Apache Airflow, Kafka, BigQuery, and AWS/Azure
  • Strong experience with ETL processes, data warehousing, and stream processing
  • Leadership skills with proven ability to mentor and grow engineering teams
  • Experience working in an Agile environment (Scrum, Kanban, etc.)
  • B2+ English, with experience communicating with English-speaking customers
Job Responsibility:
  • Guide a team of data engineers in building and optimizing data pipelines
  • Oversee architecture for data ingestion, transformation, and storage with BigQuery and SQL, ensuring high performance and reliability
  • Collaborate with product managers and clients to define data strategies and resolve complex technical challenges
  • Stay up-to-date with the latest cloud data technologies and industry best practices, bringing innovation to our data ecosystem
What we offer:
  • EDU corporate community (300+ members): tech communities, interest clubs, events, a small R&D lab, a knowledge base, and a dedicated AI track
  • Licenses for AI tools: GitHub Copilot, Cursor, and others
  • Expanded medical support for employees in Tashkent
  • 19 working days of vacation per year, 21 after two years in the company
  • Corporate getaway & teambuilding activities
  • Support for the significant events in your life
  • Referral bonuses for bringing in new talent
Employment Type:
Fulltime

Lead Data Engineer

Embark on an exciting journey into the realm of software product development wit...
Location:
India, Noida
Salary:
Not provided
3Pillar Global
Expiration Date:
Until further notice
Requirements:
  • Demonstrated expertise with a minimum of 8 years of relevant experience in data engineering with experience of leading and managing a technical support or data engineering team
  • Ability to function as a player-coach — leading while also contributing hands-on
  • Proficiency in Python for data engineering, ETL/ELT workflows, and automation tasks
  • Solid understanding of data governance, security, and performance optimization practices
  • Experience designing, creating and maintaining data pipelines in AWS environments, with strong exposure to AWS Data services like S3, Glue, Lambda, and Step Functions, and other related services
  • Data Architecture: Expert-level understanding of ETL, data warehouse design, and pipeline optimization
  • Database Expertise: Deep experience with Teradata, Change Data Capture (CDC) processes, and data synchronization
  • Integration: Experience with Lambda-based integrations, such as triggering SAS or RMJ events
  • Strong communication skills and ability to translate complex technical concepts for business stakeholders
  • Resilience: Ability to make critical, autonomous decisions and lead a team in a high-pressure environment
Job Responsibility:
  • Team Management: Manage the on-site team of 3-4 Data Engineers
  • Roster & Scheduling: Create, manage, and maintain team rosters and schedules to ensure full 8-hour coverage, 7 days a week, including rotational weekend and holiday coverage
  • Availability: Must be able to work a day shift and manage a team that provides 7-day-a-week coverage, including weekends and holidays on rotation
  • Coverage Assurance: Handle all team leave requests (planned and unplanned). You are responsible for ensuring coverage is always maintained, especially during medical or other unexpected absences
  • Performance & SLAs: Act as the primary operational point of contact, ensuring the team executes all tasks efficiently, makes sound decisions, and adheres to all agreed-upon SLAs
  • Final Escalation Point: Serve as the final technical escalation point for the team; personally handle the most critical job failures and system issues
  • Expert RCA: Lead all high-priority Root Cause Analysis (RCA) efforts, documenting solutions and implementing permanent fixes to prevent recurrence
  • Strategic Improvement: Drive the team's efforts in performance tuning and AWS Glue cost optimization
  • Automation Strategy: Define and lead the development of new automation jobs (using Glue, Lambda, and Step Functions) to reduce manual support tasks
Employment Type:
Fulltime

Data Engineering Lead

Embark on an exciting journey into the realm of software product development wit...
Location:
India
Salary:
Not provided
3Pillar Global
Expiration Date:
Until further notice
Requirements:
  • 8+ years of experience in Data Engineering or related field, including 2+ years in a lead role
  • Expert-level proficiency with AWS data services (e.g., Glue, EMR, Lambda, Redshift, S3, Kinesis, Step Functions)
  • Strong Python skills for data processing, automation, and pipeline development
  • Experience building batch and streaming pipelines (Spark, PySpark, Kafka, Kinesis, etc.)
  • Strong SQL expertise and experience with relational and NoSQL databases
  • Hands-on experience with IaC (Terraform, CloudFormation, CDK)
  • Familiarity with DevOps tools for CI/CD (e.g., GitHub Actions, GitLab CI, Jenkins)
  • Understanding of data modeling, data warehousing concepts, and distributed systems
Employment Type:
Fulltime

Lead Data Engineer

We're looking for a Lead Data Engineer to build the data infrastructure that pow...
Location:
United States
Salary:
185000.00 - 225000.00 USD / Year
Zora
Expiration Date:
Until further notice
Requirements:
  • 7+ years of experience in data engineering, with at least 2 years in a technical leadership role
  • Strong proficiency in Python and SQL for building production data pipelines, complex data transformations and evolving data platforms, shared infrastructure, and internal tooling with engineering best practices.
  • Strong experience in designing, building, and maintaining cloud-based data pipelines using orchestration tools such as Airflow, Dagster, Prefect, Temporal, or similar.
  • Hands-on experience with cloud data platforms (AWS, GCP, or Azure) and modern data stack tools
  • Deep understanding of data warehousing concepts and experience with platforms like Snowflake, BigQuery, Redshift, or similar
  • Strong software engineering fundamentals including testing, CI/CD, version control, and writing maintainable, documented code
  • Track record of optimizing data systems for performance, reliability, and cost efficiency at scale
  • Excellent communication skills and ability to collaborate with cross-functional teams including product, engineering, and design
Job Responsibility:
  • Design and build scalable data pipelines to ingest, process, and transform blockchain data, trading events, user activity, and market signals at high volume and low latency
  • Architect and maintain data infrastructure that powers real-time trading analytics, P&L calculations, leaderboards, market cap tracking, and liquidity monitoring across the platform
  • Own ETL/ELT processes that transform raw onchain data from multiple blockchains into clean, reliable, and performant datasets used by product, engineering, analytics, and ML teams
  • Build and optimize data models and schemas that support both operational systems (serving live trading data) and analytical use cases (understanding market dynamics and user behavior)
  • Establish data quality frameworks including monitoring, alerting, testing, and validation to ensure pipeline reliability and data accuracy at scale
  • Collaborate with backend engineers to design event schemas, data contracts, and APIs that enable real-time data flow between systems
  • Partner with product and analytics teams to understand data needs and translate them into robust engineering solutions
  • Provide technical leadership by mentoring engineers, conducting code reviews, establishing best practices, and driving architectural decisions for the data platform
  • Optimize performance and costs of data infrastructure as we scale to handle exponentially growing trading volumes
What we offer:
  • Remote-First Culture: Work from anywhere in the world!
  • Competitive Compensation: Including salary, pre-IPO stock options, token compensation, and additional financial incentives
  • Comprehensive Benefits: Robust healthcare options, including fully covered medical, dental, and vision for employees
  • Retirement Contributions: Up to 4% employer match on your 401(k) contributions
  • Health & Wellness: Free memberships to One Medical, Teladoc, and Health Advocate
  • Unlimited Time Off: Flexible vacation policies, company holidays, and recharge weeks to prioritize wellness
  • Home Office Reimbursement: To cover home office items, monthly home internet, and monthly cell phone (if applicable)
  • Ease of Life Reimbursement: To cover everything from an Uber home in the rain, childcare, or meal delivery
  • Career Development: Access to mentorship, training, and opportunities to grow your career
  • Inclusive Environment: A culture dedicated to diversity, equity, inclusion, and belonging
Employment Type:
Fulltime

Lead Data Engineer

Sparteo is an independent suite of AI-powered advertising technologies built on ...
Salary:
Not provided
Sparteo
Expiration Date:
Until further notice
Requirements:
  • Proficiency in distributed data systems
  • Proficient in clustering, various table types, and data types
  • Strong understanding of materialized views concepts
  • Skilled in designing table sorting keys
  • Solid programming skills in Python, Java, or Scala
  • Expertise in database technologies (SQL, NoSQL)
  • You are comfortable using AI-assisted development tools (e.g., GitHub Copilot, Tabnine)
  • Proven experience leading data teams in fast-paced environments
  • Ability to mentor junior engineers and foster a culture of growth and collaboration
  • Data-driven decision-making abilities aligned with Sparteo's focus on results and improvement
Job Responsibility:
  • Data Infrastructure Design and Optimization
  • Lead the design, implementation, and optimization of data architectures to support massive data pipelines
  • Ensure the scalability, security, and performance of the data infrastructure
  • Collaborate with software and data scientists to integrate AI-driven models into data workflows
  • Leadership and Team Management
  • Manage and mentor a team of 2 data engineers, fostering a culture of continuous improvement
  • Oversee project execution and delegate responsibilities within the team
  • Guide technical decisions and promote best practices in data engineering
  • Collaboration and Cross-Functional Engagement
  • Work closely with product managers, developers, and analytics teams to define data needs and ensure alignment with business objectives
What we offer:
  • A convivial and flexible working environment, with our telecommuting culture integrated into the company's organization
  • A friendly and small-sized team that you can find in our offices near Lille or in Paris
  • Social gatherings and company events organized throughout the year
  • Sparteo is experiencing significant growth both in terms of business and workforce, especially internationally
  • Additional benefits include an advantageous compensation system with non-taxable and non-mandatory overtime hours, as well as a Swile restaurant ticket card
Employment Type:
Fulltime

Lead Data Engineer

As a Lead Data Engineer at Rearc, you'll play a pivotal role in establishing and...
Location:
India, Bengaluru
Salary:
Not provided
Rearc
Expiration Date:
Until further notice
Requirements:
  • 10+ years of experience in data engineering, data architecture, or related fields
  • Extensive experience in writing and testing Java and/or Python
  • Proven experience with data pipeline orchestration using platforms such as Airflow, Databricks, dbt or AWS Glue
  • Hands-on experience with data analysis tools and libraries like PySpark, NumPy, pandas, or Dask
  • Proficiency with Spark and Databricks is highly desirable
  • Proven track record of leading complex data engineering projects, including designing and implementing scalable data solutions
  • Hands-on experience with ETL processes, data warehousing, and data modeling tools
  • In-depth knowledge of data integration tools and best practices
  • Strong understanding of cloud-based data services and technologies (e.g., AWS Redshift, Azure Synapse Analytics, Google BigQuery)
  • Strong strategic and analytical skills
Job Responsibility:
  • Understand Requirements and Challenges: Collaborate with stakeholders to deeply understand their data requirements and challenges
  • Implement with a DataOps Mindset: Embrace a DataOps mindset and utilize modern data engineering tools and frameworks, such as Apache Airflow, Apache Spark, or similar, to build scalable and efficient data pipelines and architectures
  • Lead Data Engineering Projects: Take the lead in managing and executing data engineering projects, providing technical guidance and oversight to ensure successful project delivery
  • Mentor Data Engineers: Share your extensive knowledge and experience in data engineering with junior team members, guiding and mentoring them to foster their growth and development in the field
  • Promote Knowledge Sharing: Contribute to our knowledge base by writing technical blogs and articles, promoting best practices in data engineering, and contributing to a culture of continuous learning and innovation

Lead Data Engineer

As a Lead Data Engineer at Rearc, you'll play a pivotal role in establishing and...
Location:
United States
Salary:
Not provided
Rearc
Expiration Date:
Until further notice
Requirements:
  • 10+ years of experience in data engineering, data architecture, or related technical fields
  • Proven ability to design, build, and optimize large-scale data ecosystems
  • Strong track record of leading complex data engineering initiatives
  • Deep hands-on expertise in ETL/ELT design, data warehousing, and data modeling
  • Extensive experience with data integration frameworks and best practices
  • Advanced knowledge of cloud-based data services and architectures (AWS Redshift, Azure Synapse Analytics, Google BigQuery, or equivalent)
  • Strong strategic and analytical thinking
  • Proficiency with modern data engineering frameworks (Databricks, Spark, lakehouse technologies like Delta Lake)
  • Exceptional communication and interpersonal skills
Job Responsibility:
  • Engage deeply with stakeholders to understand data needs, business challenges, and technical constraints
  • Translate stakeholder needs into scalable, high-quality data solutions
  • Implement with a DataOps mindset using tools like Apache Airflow, Databricks/Spark, Kafka
  • Build reliable, automated, and efficient data pipelines and architectures
  • Lead and execute complex projects
  • Provide technical direction and set engineering standards
  • Ensure alignment with customer goals and company principles
  • Mentor and develop data engineers
  • Promote knowledge sharing and thought leadership
  • Contribute to internal and external content
What we offer:
  • Comprehensive health benefits
  • Generous time away and flexible PTO
  • Maternity and paternity leave
  • Access to educational resources with reimbursement for continued learning
  • 401(k) plan with company contribution