Data Pipeline Engineer +Airflow

India, Pune City · Job Posted March 20, 2026
Job Description

We are looking for a Data Pipeline Engineer to design, build, and operate scalable, reliable data pipelines for enterprise data platforms. This is a hands-on individual contributor role that requires strong working knowledge of the tools listed below.

Job Responsibility

  • Build and maintain data transformation pipelines using dbt/Spark
  • Develop and optimize large-scale, CPU-intensive data processing using Apache Spark/Dremio
  • Orchestrate workflows using Airflow and/or Dagster
  • Implement data quality checks, testing, and monitoring for pipelines
  • Support schema evolution, backfills, and incremental processing
  • Ensure pipelines meet SLAs for freshness, reliability, and performance
  • Apply expertise or working knowledge of Dremio (semantic layer, virtual datasets, Reflections)

Requirements

  • Strong hands-on experience with dbt
  • Strong hands-on experience with Apache Spark
  • Experience with Dremio/Trino or similar lakehouse query engines
  • Experience with Airflow and/or Dagster
  • Understanding of data catalogs and lineage (e.g., OpenLineage, DataHub, Apache Polaris)
  • Proficiency in Python
  • Experience with Git-based development and CI/CD

Nice to have

  • Open table formats (Apache Iceberg), Apache Arrow
  • CDC-based analytics pipelines
  • Cloud platforms (AWS)
  • Kubernetes-based data platforms

Similar Jobs for Data Pipeline Engineer +Airflow

8 matching positions

Senior AWS Data Engineer / Data Platform Engineer

We are seeking a highly experienced Senior AWS Data Engineer to design, build, a...
Location: United Arab Emirates, Dubai
Salary: Not provided
NorthBay
Expiration Date: Until further notice
Requirements
  • 8+ years of experience in data engineering and data platform development
  • Strong hands-on experience with: AWS Glue, Amazon EMR (Spark), AWS Lambda, Apache Airflow (MWAA), Amazon EC2, Amazon CloudWatch, Amazon Redshift, Amazon DynamoDB, AWS DataZone
Job Responsibility
  • Design, develop, and optimize scalable data pipelines using AWS native services
  • Lead the implementation of batch and near-real-time data processing solutions
  • Architect and manage data ingestion, transformation, and storage layers
  • Build and maintain ETL/ELT workflows using AWS Glue and Apache Spark on EMR
  • Orchestrate complex data workflows using Apache Airflow (MWAA)
  • Develop and manage serverless data processing using AWS Lambda
  • Design and optimize data warehouses using Amazon Redshift
  • Implement and manage NoSQL data models using Amazon DynamoDB
  • Utilize AWS DataZone for data governance, cataloging, and access management
  • Monitor, log, and troubleshoot data pipelines using Amazon CloudWatch
  • Fulltime

Data Pipeline Engineer

We are looking for a Data Pipeline Engineer to design, build, and operate scalab...
Location: India, Pune City
Salary: Not provided
Wissen
Expiration Date: Until further notice
Requirements
  • Strong hands-on experience with dbt
  • Strong hands-on experience with Apache Spark
  • Experience with Dremio/Trino or similar lakehouse query engines
  • Experience with Airflow and/or Dagster
  • Understanding of data catalogs and lineage (e.g., OpenLineage, DataHub, Apache Polaris)
  • Proficiency in Python
  • Experience with Git-based development and CI/CD
Job Responsibility
  • Build and maintain data transformation pipelines using dbt/Spark
  • Develop and optimize large-scale, CPU-intensive data processing using Apache Spark/Dremio
  • Orchestrate workflows using Airflow and/or Dagster
  • Implement data quality checks, testing, and monitoring for pipelines
  • Support schema evolution, backfills, and incremental processing
  • Ensure pipelines meet SLAs for freshness, reliability, and performance
  • Apply expertise or working knowledge of Dremio (semantic layer, virtual datasets, Reflections)
  • Fulltime

Senior Data Engineer / Data Analyst

N-iX is a global software development service company. Our customer is the Europ...
Location: Ukraine
Salary: Not provided
N-iX
Expiration Date: Until further notice
Requirements
  • 4+ years of experience in analytics, sales operations, revenue insights, or data-driven consulting
  • Strong SQL skills (joins, window functions, performance tuning)
  • Production experience with Apache Airflow
  • Python skills for pipeline development
  • Ability to build and maintain database views (PostgreSQL, Snowflake, Redshift)
  • Solid ETL/ELT understanding
  • Clear communication with non-technical stakeholders
  • GitHub experience (branches, PRs, reviews)
  • English level - at least Upper-Intermediate, both spoken and written
Job Responsibility
  • Work with Salesforce objects and relationships to ensure correct ingestion, transformation, and integration into our reporting environment
  • Design, build, and maintain data pipelines using Apache Airflow (DAG creation, scheduling, monitoring, troubleshooting)
  • Write, optimize, and maintain SQL queries, stored procedures, and functions for data transformation and extraction
  • Create and manage database views to support analytics, reporting, and downstream applications
  • Ensure data quality, consistency, and reliability across pipelines and views (validation checks, monitoring)
  • Support QuickSuite dataset adjustments (new fields, logic changes, view extensions)
  • Document data flows, data models, and pipeline logic for long-term maintainability and handover
What we offer
  • Flexible working format - remote, office-based or flexible
  • A competitive salary and good compensation package
  • Personalized career growth
  • Professional development tools (mentorship program, tech talks and trainings, centers of excellence, and more)
  • Active tech communities with regular knowledge sharing
  • Education reimbursement
  • Memorable anniversary presents
  • Corporate events and team buildings
  • Other location-specific benefits

Data Engineer (Business Data)

As a Data Engineer on the R&D Team, you will help FreshBooks build and evolve hi...
Location: Canada, Toronto
Salary: 102400.00 - 128000.00 CAD / Year
FreshBooks
Expiration Date: Until further notice
Requirements
  • 2+ years of experience working in data engineering, analytics engineering, or a related field
  • Experience building and maintaining data models and transformation pipelines (e.g., dbt or similar tools)
  • Strong SQL skills and proficiency in Python (or similar language)
  • Solid understanding of data modeling concepts (e.g., dimensional modeling, normalization, data warehousing patterns)
  • Experience working with a cloud data warehouse (e.g., BigQuery, Snowflake, Redshift)
  • Familiarity with orchestrators such as Airflow, GCC, Dagster, Prefect (or similar tools)
  • Basic understanding or exposure to streaming/event-driven systems (e.g., Pub/Sub, Kafka, Kinesis, Dataflow)
  • Understanding of data quality, testing, and validation practices
  • Ability to work cross-functionally and communicate clearly with both technical and non-technical stakeholders
Job Responsibility
  • Architect, design, and develop clean, high-performance datasets using modern tools like dbt and BigQuery, focusing on usability and scalability for analytical consumption
  • Be a key contributor to our domain-oriented data architecture, defining how core business entities (e.g., customers, payments) are modeled, governed, and exposed across the organization
  • Build and maintain robust batch and streaming data pipelines that transform raw data into trusted, analytics-ready assets to support both near real-time and traditional use cases
  • Collaborate closely with Analytics, Product, and Machine Learning teams to translate complex requirements into reusable, well-governed data models and contracts
  • Champion data quality, reliability, and documentation by implementing rigorous testing, validation, and monitoring practices
  • Leverage cutting-edge tools, including AI/agentic workflows, to accelerate development, enhance productivity, and improve data exploration and lineage
  • Participate in code reviews, contribute to improving engineering standards, and partner with platform teams to ensure our data solutions meet ambitious performance, cost, and scalability goals
What we offer
  • Comprehensive health and wellness benefits
  • Generous time off including a flexible vacation plan
  • Retirement savings program or pension plan matched to your local office
  • Stock options for every full-time employee
  • Parental leave and new parent support
  • Annual healthy living credit
  • Comprehensive medical and dental benefits
  • Fertility and gender-affirming benefits
  • Peer Recognition Program
  • Employee Assistance Program
  • Fulltime

Data Engineer (Modern Data Stack / DataOps)

We are looking for a Senior Data Engineer with experience in modern cloud data p...
Location: Romania
Salary: Not provided
ddroidd
Expiration Date: Until further notice
Requirements
  • 7+ years of experience in Data Engineering or similar roles
  • Strong hands-on experience with Snowflake
  • Strong hands-on experience with Apache Airflow
  • Strong hands-on experience with dbt (data build tool)
  • Strong SQL expertise
  • Experience building and managing ELT pipelines
  • Experience with Git and collaborative development workflows
  • Familiarity with DataOps / CI/CD practices for data pipelines
  • Solid understanding of data modeling and data warehouse architecture
Job Responsibility
  • Design and maintain data pipelines and ELT workflows within a modern cloud data platform
  • Build and maintain data transformation models using dbt, including testing, documentation, and modular data modeling
  • Orchestrate and monitor workflows using Apache Airflow
  • Manage and optimize Snowflake data warehouse environments, including performance and cost efficiency
  • Implement DataOps practices such as CI/CD, automated testing, and deployment for data pipelines
  • Ensure data quality, reliability, and observability across the data platform
  • Collaborate with analytics, product, and engineering teams to deliver reliable datasets and data products
  • Improve monitoring, automation, and operational processes for the data platform
What we offer
  • Private medical insurance
  • National holidays off, even when falling on weekends
  • Loyalty leave: +1 day/year
  • Continuous professional development opportunities
  • Sports subscription programs
  • Referral bonuses for bringing in new talent
  • Meal tickets
  • Bookster subscription for reading & learning
  • Community and team-building events
  • Flexible and unlimited remote work policy
  • Fulltime

Software Engineer III (Data Engineer)

We are seeking a motivated and detail-oriented Data Engineer to join our data en...
Location: United States, Irvine
Salary: 96500.00 - 138061.00 USD / Year
Hyundai AutoEver America
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Computer Science, Information Technology, or related field
  • 10+ years of experience in data warehouse and MDM applications
  • Extensive knowledge of SQL and relational databases
  • Strong programming skills in Python and PL/SQL
  • 10+ years of experience working with Informatica
  • Knowledge of Informatica IDMC and PowerCenter is a must
  • Knowledge of big data technologies such as Apache Spark, Hadoop, or equivalent cloud-based services
  • Strong understanding of data governance, security, and quality practices in cloud environments
Job Responsibility
  • Design, build, and maintain scalable data pipelines and ETL processes to transform and integrate data from multiple sources
  • Develop workflow automation using Apache Airflow and manage production batches with performance tuning
  • Create and optimize data models, schemas, and database structures for scalability and efficiency
  • Implement best practices in data warehouse and MDM applications, ensuring data governance, security, and quality
  • Collaborate with data scientists, analysts, and business stakeholders to translate requirements into technical solutions
  • Develop scripts and automation tools to streamline data processing and pipeline operations
  • Implement automated monitoring, alerting systems, and maintain documentation for pipelines and data models
  • Create reports and visualizations to communicate pipeline performance and data insights
What we offer
  • Comprehensive medical/dental coverage
  • Generous PTO
  • Education assistance
  • Annual merit increase eligibility
  • Fulltime

Data Engineer, Enterprise Data, Analytics and Innovation

Are you passionate about building robust data infrastructure and enabling innova...
Location: United States
Salary: 110000.00 - 125000.00 USD / Year
Vaniam Group
Expiration Date: Until further notice
Requirements
  • 5+ years of professional experience in data engineering, ETL, or related roles
  • Strong proficiency in Python and SQL for data engineering
  • Hands-on experience building and maintaining pipelines in a lakehouse or modern data platform
  • Practical understanding of Medallion architectures and layered data design
  • Familiarity with modern data stack tools, including: Spark or PySpark; workflow orchestration (Airflow, dbt, or similar); testing and observability frameworks; containers (Docker) and Git-based version control
  • Excellent communication skills, problem-solving mindset, and a collaborative approach
Job Responsibility
  • Design, build, and operate reliable ETL and ELT pipelines in Python and SQL
  • Manage ingestion into Bronze, standardization and quality in Silver, and curated serving in Gold layers of our Medallion architecture
  • Maintain ingestion from transactional MySQL systems into Vaniam Core to keep production data flows seamless
  • Implement observability, data quality checks, and lineage tracking to ensure trust in all downstream datasets
  • Develop schemas, tables, and views optimized for analytics, APIs, and product use cases
  • Apply and enforce best practices for security, privacy, compliance, and access control, ensuring data integrity across sensitive healthcare domains
  • Maintain clear and consistent documentation for datasets, pipelines, and operating procedures
  • Lead the integration of third-party datasets, client-provided sources, and new product-generated data into Vaniam Core
  • Partner with product and innovation teams to build repeatable processes for onboarding new data streams
  • Ensure harmonization, normalization, and governance across varied data types (scientific, engagement, operational)
What we offer
  • 100% remote environment with opportunities for local meet-ups
  • Positive, diverse, and supportive culture
  • Passionate about serving clients focused on Cancer and Blood diseases
  • Investment in you with opportunities for professional growth and personal development through Vaniam Group University
  • Health benefits – medical, dental, vision
  • Generous parental leave benefit
  • Focused on your financial future with a 401(k) Plan and company match
  • Work-Life Balance and Flexibility
  • Flexible Time Off policy for rest and relaxation
  • Volunteer Time Off for community involvement
  • Fulltime

Senior Solutions Engineer – Big Data & Data Infrastructure

This is a great opportunity to be part of one of the fastest-growing infrastruct...
Location: Israel, Tel Aviv
Salary: Not provided
VAST Data
Expiration Date: Until further notice
Requirements
  • 2–4 years in software / solution or infrastructure engineering
  • 2–4 years focused on building / maintaining large-scale data pipelines / storage & database solutions
  • Proficiency in Trino, Spark (Structured Streaming & batch) and solid working knowledge of Apache Kafka
  • Coding background in Python (must-have)
  • Deep understanding of data storage architectures including SQL, NoSQL, and HDFS
  • Solid grasp of DevOps practices, including containerization (Docker), orchestration (Kubernetes), and infrastructure provisioning (Terraform)
  • Experience with distributed systems, stream processing, and event-driven architecture
  • Hands-on familiarity with benchmarking and performance profiling for storage systems, databases, and analytics engines
  • Excellent communication skills
Job Responsibility
  • Build distributed data pipelines using technologies like Kafka, Spark (batch & streaming), Python, Trino, Airflow, and S3-compatible data lakes
  • Design, deploy, and troubleshoot hybrid cloud/on-prem environments using Terraform, Docker, Kubernetes, and CI/CD automation tools
  • Implement event-driven and serverless workflows
  • Create technical guides, architecture docs, and demo pipelines
  • Integrate data validation, observability tools, and governance directly into the pipeline lifecycle
  • Own end-to-end platform lifecycle: ingestion → transformation → storage (Parquet/ORC on S3) → compute layer (Trino/Spark)
  • Benchmark and tune storage backends (S3/NFS/SMB) and compute layers for throughput, latency, and scalability using production datasets
  • Work cross-functionally with R&D to push performance limits across interactive, streaming, and ML-ready analytics workloads
  • Operate and debug object store–backed data lake infrastructure