
Data Pipeline Engineer

India, Pune City · Job Posted March 05, 2026

Job Description

We are looking for a Data Pipeline Engineer to design, build, and operate scalable, reliable data pipelines for enterprise data platforms. This is a hands-on individual contributor role, so the candidate must have strong working knowledge of the technologies listed below.

Job Responsibility

  • Build and maintain data transformation pipelines using dbt/Spark
  • Develop and optimize large-scale, CPU-intensive data processing using Apache Spark/Dremio
  • Orchestrate workflows using Airflow and/or Dagster
  • Implement data quality checks, testing, and monitoring for pipelines
  • Support schema evolution, backfills, and incremental processing
  • Ensure pipelines meet SLAs for freshness, reliability, and performance
  • Apply expertise or working knowledge of Dremio (semantic layer, virtual datasets, Reflections)

Requirements

  • Strong hands-on experience with dbt
  • Strong hands-on experience with Apache Spark
  • Experience with Dremio/Trino or similar lakehouse query engines
  • Experience with Airflow and/or Dagster
  • Understanding of data catalogs and lineage (e.g., OpenLineage, DataHub, Apache Polaris)
  • Proficiency in Python
  • Experience with Git-based development and CI/CD

Nice to have

  • Open table formats (e.g., Apache Iceberg), Apache Arrow
  • CDC-based analytics pipelines
  • Cloud platforms (AWS)
  • Kubernetes-based data platforms

Similar Jobs for Data Pipeline Engineer

8 matching positions

Aws Data Engineer (Cloud Data Platform & Pipeline Specialist)

Design, develop, and maintain scalable cloud-based data pipelines using AWS serv...
Location: United States, Atlanta
Salary: Not provided
Company: NTT DATA
Expiration Date: Until further notice
Requirements
  • 5+ years of experience in data engineering, with strong hands-on expertise in AWS data services (Glue, EMR, S3, RDS, DataSync, DMS)
  • 5+ years of proven experience building and managing data pipelines (batch and streaming) in cloud environments
  • 5+ years of strong experience in data migration, transformation frameworks, and large-scale data replication
  • 5+ years of deep understanding of data modeling, data transformation, and reconciliation techniques
  • 5+ years of experience designing and implementing secure data access and governance (least privilege principles)
  • 5+ years of hands-on experience with data validation, auditing, and reconciliation processes
  • Familiarity with regulatory or finance data environments and reporting workloads
  • 5+ years of strong problem-solving skills and ability to work in a collaborative, fast-paced environment
Job Responsibility
  • Design, develop, and maintain scalable cloud-based data pipelines using AWS services such as Glue, EMR, S3, RDS, DataSync, and DMS
  • Build and optimize batch and streaming data orchestration workflows to support enterprise data platforms
  • Lead large-scale data migration efforts, including legacy-to-cloud transformations and replication strategies
  • Perform data modeling, transformation, and reconciliation to ensure high-quality, consistent datasets across systems
  • Implement secure data access patterns following least-privilege principles for pipelines and datasets
  • Collaborate with data architects, analysts, and business stakeholders to understand data requirements and deliver solutions
  • Establish robust data validation, reconciliation, and audit mechanisms to meet regulatory and reporting requirements
  • Troubleshoot and optimize performance of ETL/ELT pipelines and data workflows in AWS environments
  • Support governance, compliance, and audit readiness for data platforms in regulated environments (finance/reporting)
Employment Type: Full-time

Senior AWS Data Engineer / Data Platform Engineer

We are seeking a highly experienced Senior AWS Data Engineer to design, build, a...
Location: United Arab Emirates, Dubai
Salary: Not provided
Company: NorthBay
Expiration Date: Until further notice
Requirements
  • 8+ years of experience in data engineering and data platform development
  • Strong hands-on experience with: AWS Glue, Amazon EMR (Spark), AWS Lambda, Apache Airflow (MWAA), Amazon EC2, Amazon CloudWatch, Amazon Redshift, Amazon DynamoDB, AWS DataZone
Job Responsibility
  • Design, develop, and optimize scalable data pipelines using AWS native services
  • Lead the implementation of batch and near-real-time data processing solutions
  • Architect and manage data ingestion, transformation, and storage layers
  • Build and maintain ETL/ELT workflows using AWS Glue and Apache Spark on EMR
  • Orchestrate complex data workflows using Apache Airflow (MWAA)
  • Develop and manage serverless data processing using AWS Lambda
  • Design and optimize data warehouses using Amazon Redshift
  • Implement and manage NoSQL data models using Amazon DynamoDB
  • Utilize AWS DataZone for data governance, cataloging, and access management
  • Monitor, log, and troubleshoot data pipelines using Amazon CloudWatch
Employment Type: Full-time

Backend Engineer - Data Pipeline

Coralogix is a modern, full-stack observability platform transforming how busine...
Location: Germany, Berlin
Salary: Not provided
Company: Coralogix
Expiration Date: Until further notice
Requirements
  • At least 5 years of software development experience
  • At least 2 years of experience developing and operating Rust-based systems in production
  • Experience with AWS/GCP cloud
Job Responsibility
  • Design and develop backend services using Rust along with other exciting technologies, all deployed in AWS on Kubernetes
Employment Type: Full-time

Backend Engineer - Data Pipeline

Coralogix is a modern, full-stack observability platform transforming how busine...
Location: Israel, Ramat Gan
Salary: Not provided
Company: Coralogix
Expiration Date: Until further notice
Requirements
  • At least 5 years of software development experience
  • At least 2 years of experience developing and operating Rust-based systems in production (must)
  • Experience with AWS/GCP cloud
  • Experience with Kafka (an advantage)
  • Experience using ClickHouse in production (an advantage)
Job Responsibility
  • Design and develop our backend services using Rust along with other exciting technologies, all deployed in AWS on Kubernetes
Employment Type: Full-time

Senior AI Data Pipeline Engineer

Shape the Future of Intelligence as our next Senior AI Data Pipeline Engineer! A...
Location: United States, Lake Oswego
Salary: 105600.00 - 145200.00 USD / Year
Company: Trimble Inc.
Expiration Date: Until further notice
Requirements
  • 3+ years of experience in data engineering or a related field
  • Strong hands-on experience managing and optimizing Databricks
  • Experience building and maintaining streaming pipelines with Kafka
  • Experience implementing Change Data Capture (CDC) using Debezium connectors
  • Practical experience deploying and operating services in Kubernetes
  • Strong proficiency in Python and/or Scala
  • Experience with SQL and distributed data processing frameworks (e.g., Spark)
  • Familiarity with cloud platforms (AWS, Azure, or GCP)
  • Experience with infrastructure-as-code tools (Terraform, etc.)
  • Strong understanding of distributed systems concepts
Job Responsibility
  • Design, build, and optimize scalable batch and real-time data pipelines
  • Manage and administer Databricks workspaces, clusters, jobs, and performance tuning
  • Develop and maintain streaming architectures using Kafka
  • Implement and manage Change Data Capture (CDC) pipelines using Debezium connectors
  • Deploy, monitor, and manage containerized workloads using Kubernetes
  • Implement CI/CD practices for data engineering workflows
  • Ensure data quality, observability, governance, and security best practices
  • Collaborate with data scientists, ML engineers, and software engineers to deliver production-grade data solutions
  • Support and optimize AI/ML data pipelines and model deployment workflows
  • Troubleshoot production issues and implement performance improvements
What we offer
  • Medical
  • Dental
  • Vision
  • Life
  • Disability
  • Time off plans
  • Retirement plans
  • Tax savings plans for health, dependent care, and commuter expenses
  • Paid Parental Leave
  • Employee Stock Purchase Plan
Employment Type: Full-time

Data Pipeline Engineer +Airflow

We are looking for a Data Pipeline Engineer to design, build, and operate scalab...
Location: India, Pune City
Salary: Not provided
Company: Wissen
Expiration Date: Until further notice
Requirements
  • Strong hands-on experience with dbt
  • Strong hands-on experience with Apache Spark
  • Experience with Dremio/Trino or similar lakehouse query engines
  • Experience with Airflow and/or Dagster
  • Understanding of data catalogs and lineage (e.g., OpenLineage, DataHub, Apache Polaris)
  • Proficiency in Python
  • Experience with Git-based development and CI/CD
Job Responsibility
  • Build and maintain data transformation pipelines using dbt/Spark
  • Develop and optimize large-scale, CPU-intensive data processing using Apache Spark/Dremio
  • Orchestrate workflows using Airflow and/or Dagster
  • Implement data quality checks, testing, and monitoring for pipelines
  • Support schema evolution, backfills, and incremental processing
  • Ensure pipelines meet SLAs for freshness, reliability, and performance
  • Apply expertise or working knowledge of Dremio (semantic layer, virtual datasets, Reflections)
Employment Type: Full-time

Senior Software Engineer, Data Pipeline

Fullstory’s mission is to help teams create amazing online experiences for their...
Location: United States, Atlanta
Salary: 160000.00 - 180000.00 USD / Year
Company: Fullstory
Expiration Date: Until further notice
Requirements
  • Strong understanding of the nuances of distributed architectures and tackling capacity and performance challenges when dealing with data at scale
  • Experience writing Golang code in production
  • Experience with Kubernetes and supporting highly available and reliable cloud based microservices in production
  • Experience working on asynchronous or streaming ingestion and processing systems and frameworks
  • Leverage AI tools to enhance work quality by implementing AI solutions that optimize efficiency
Job Responsibility
  • Engineer distributed systems that operate at tens to hundreds of thousands of requests per second using Go, Kubernetes, and GCP
  • Explore ideas about how to unlock new features through thoughtful architecture and framework designs
  • Ensure the quality and reliability of Fullstory's capture and extraction systems across many services and downstream applications both internally and externally
  • Collaborate with technical leaders and product experts to evolve the technical roadmap for Ingestion services, and to participate in collaborative development efforts across the Engineering organization
What we offer
  • Flexibility and Connection with vibrant HQ in Atlanta and a tight-knit group in London
  • Flexible PTO policy and an annual company-wide closure, along with federal holidays
  • Sponsored benefit packages for US-based Fullstorians, and supplemental coverage options for international Fullstorians
  • Professional development opportunities through training programs and an annual learning subsidy for US and EMEA-based employees
  • Monthly productivity stipend for US and EMEA-based Fullstorians
  • Team Collaboration through team off-sites and an annual full-company meet-up
  • Paid parental leave
  • Bereavement leave, including miscarriage/pregnancy loss
Employment Type: Full-time

Backend Engineer - Data Pipeline

Coralogix is a modern, full-stack observability platform transforming how busine...
Location: Israel, Ramat Gan
Salary: Not provided
Company: Coralogix
Expiration Date: Until further notice
Requirements
  • 5+ years of development experience with Scala or another JVM language
  • Extensive hands-on experience with scalable and distributed systems architecture and design
  • 4+ years of hands-on experience with Data Streaming technologies, including Apache Kafka, Spark Streaming, KafkaStreams, or Apache Flink
  • Proficiency in data modeling and designing systems to handle large-scale, distributed datasets efficiently
  • Experience with containerization and orchestration tools, including Kubernetes and Docker containers
  • Strong knowledge of distributed computing paradigms and principles, such as consistency, partitioning, and resilience
  • B.Sc. in Computer Science or an equivalent field
Job Responsibility
  • End-to-end development and ownership of our products and features, from design to scalable and predictable production behavior
  • Solve diverse complex problems of high scale
  • Collaborate with other engineers and product managers in order to improve our products
  • Review code, architecture, and data to identify and troubleshoot problems
Employment Type: Full-time