Data Pipeline Engineer +Airflow

Wissen

Location:
India, Pune City


Contract Type:
Not provided

Salary:
Not provided

Job Description:

We are looking for a Data Pipeline Engineer to design, build, and operate scalable, reliable data pipelines for enterprise data platforms. This is a hands-on individual contributor role that requires strong working knowledge of the modern data stack.

Job Responsibility:

  • Build and maintain data transformation pipelines using dbt/Spark
  • Develop and optimize large-scale, CPU-intensive data processing using Apache Spark/Dremio
  • Orchestrate workflows using Airflow and/or Dagster
  • Implement data quality checks, testing, and monitoring for pipelines
  • Support schema evolution, backfills, and incremental processing
  • Ensure pipelines meet SLAs for freshness, reliability, and performance
  • Expertise/working knowledge in Dremio (semantic layer, virtual datasets, Reflections)
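
To make the incremental-processing and data-quality bullets above concrete, here is a minimal stdlib-only Python sketch of a watermark-based incremental step behind a simple quality gate. All names and thresholds are hypothetical illustrations, not part of the posting or tied to any specific framework; in practice this logic would live inside an Airflow or Dagster task.

```python
import datetime

# Hypothetical sketch: watermark-based incremental selection plus a
# simple row-count / required-field quality gate. Names and thresholds
# are illustrative only.

def select_incremental(rows, watermark):
    """Keep only rows newer than the last processed watermark."""
    return [r for r in rows if r["updated_at"] > watermark]

def quality_gate(rows, min_rows=1, required_fields=("id", "updated_at")):
    """Fail fast if the batch is empty or rows lack required fields."""
    if len(rows) < min_rows:
        raise ValueError(f"quality gate failed: {len(rows)} rows < {min_rows}")
    for r in rows:
        missing = [f for f in required_fields if f not in r]
        if missing:
            raise ValueError(f"row {r!r} missing fields {missing}")
    return rows

rows = [
    {"id": 1, "updated_at": datetime.date(2026, 3, 18)},
    {"id": 2, "updated_at": datetime.date(2026, 3, 20)},
]
batch = quality_gate(select_incremental(rows, datetime.date(2026, 3, 19)))
print([r["id"] for r in batch])  # only rows past the watermark survive
```

The same watermark pattern also underpins backfills: re-running with an earlier watermark reprocesses the corresponding window.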

Requirements:

  • Strong hands-on experience with dbt
  • Strong hands-on experience with Apache Spark
  • Experience with Dremio/Trino or similar lakehouse query engines
  • Experience with Airflow and/or Dagster
  • Understanding of data catalogs and lineage (e.g., OpenLineage, DataHub, Apache Polaris)
  • Proficiency in Python
  • Experience with Git-based development and CI/CD
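
The catalog and lineage requirement above centers on emitting structured run events. As a rough, stdlib-only sketch of the shape of an OpenLineage-style run event (the job names and producer URI are hypothetical, and a real pipeline would use the official OpenLineage client rather than hand-building JSON):

```python
import json
import uuid
import datetime

# Hypothetical sketch of an OpenLineage-style run event. Real pipelines
# emit these through an OpenLineage client or integration; here we only
# assemble the JSON shape with the standard library.
event = {
    "eventType": "COMPLETE",
    "eventTime": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    "run": {"runId": str(uuid.uuid4())},
    "job": {"namespace": "analytics", "name": "orders_daily"},  # hypothetical job
    "inputs": [{"namespace": "warehouse", "name": "raw.orders"}],
    "outputs": [{"namespace": "warehouse", "name": "marts.orders_daily"}],
    "producer": "https://example.com/pipelines",  # placeholder producer URI
}
print(json.dumps(event, indent=2))
```

Events like this, emitted at task start and completion, are what catalogs such as DataHub stitch together into end-to-end lineage graphs.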

Nice to have:

  • Open table formats (Apache Iceberg), Apache Arrow
  • CDC-based analytics pipelines
  • Cloud platforms (AWS)
  • Kubernetes-based data platforms

Additional Information:

Job Posted:
March 20, 2026

Employment Type:
Full-time

Similar Jobs for Data Pipeline Engineer +Airflow

Senior AWS Data Engineer / Data Platform Engineer

We are seeking a highly experienced Senior AWS Data Engineer to design, build, a...
Location:
United Arab Emirates, Dubai
Salary:
Not provided
NorthBay
Expiration Date
Until further notice
Requirements:
  • 8+ years of experience in data engineering and data platform development
  • Strong hands-on experience with: AWS Glue, Amazon EMR (Spark), AWS Lambda, Apache Airflow (MWAA), Amazon EC2, Amazon CloudWatch, Amazon Redshift, Amazon DynamoDB, AWS DataZone
Job Responsibility:
  • Design, develop, and optimize scalable data pipelines using AWS native services
  • Lead the implementation of batch and near-real-time data processing solutions
  • Architect and manage data ingestion, transformation, and storage layers
  • Build and maintain ETL/ELT workflows using AWS Glue and Apache Spark on EMR
  • Orchestrate complex data workflows using Apache Airflow (MWAA)
  • Develop and manage serverless data processing using AWS Lambda
  • Design and optimize data warehouses using Amazon Redshift
  • Implement and manage NoSQL data models using Amazon DynamoDB
  • Utilize AWS DataZone for data governance, cataloging, and access management
  • Monitor, log, and troubleshoot data pipelines using Amazon CloudWatch

Senior Software Engineer, Data Engineering

Join us in building the future of finance. Our mission is to democratize finance...
Location:
United States, Menlo Park
Salary:
146000.00 - 198000.00 USD / Year
Robinhood
Expiration Date
Until further notice
Requirements:
  • 5+ years of professional experience building end-to-end data pipelines
  • Hands-on software engineering experience, with the ability to write production-level code in Python for user-facing applications, services, or systems (not just data scripting or automation)
  • Expert at building and maintaining large-scale data pipelines using open source frameworks (Spark, Flink, etc)
  • Strong SQL (Presto, Spark SQL, etc) skills
  • Experience solving problems across the data stack (Data Infrastructure, Analytics and Visualization platforms)
  • Expert collaborator with the ability to democratize data through actionable insights and solutions
Job Responsibility:
  • Help define and build key datasets across all Robinhood product areas. Lead the evolution of these datasets as use cases grow
  • Build scalable data pipelines using Python, Spark and Airflow to move data from different applications into our data lake
  • Partner with upstream engineering teams to enhance data generation patterns
  • Partner with data consumers across Robinhood to understand consumption patterns and design intuitive data models
  • Ideate and contribute to shared data engineering tooling and standards
  • Define and promote data engineering best practices across the company
What we offer:
  • Market competitive and pay equity-focused compensation structure
  • 100% paid health insurance for employees with 90% coverage for dependents
  • Annual lifestyle wallet for personal wellness, learning and development, and more
  • Lifetime maximum benefit for family forming and fertility benefits
  • Dedicated mental health support for employees and eligible dependents
  • Generous time away including company holidays, paid time off, sick time, parental leave, and more
  • Lively office environment with catered meals, fully stocked kitchens, and geo-specific commuter benefits
  • Bonus opportunities
  • Equity

Software Engineer, Data Engineering

Join us in building the future of finance. Our mission is to democratize finance...
Location:
Canada, Toronto
Salary:
124000.00 - 145000.00 CAD / Year
Robinhood
Expiration Date
Until further notice
Requirements:
  • 3+ years of professional experience building end-to-end data pipelines
  • Hands-on software engineering experience, with the ability to write production-level code in Python for user-facing applications, services, or systems (not just data scripting or automation)
  • Expert at building and maintaining large-scale data pipelines using open source frameworks (Spark, Flink, etc)
  • Strong SQL (Presto, Spark SQL, etc) skills
  • Experience solving problems across the data stack (Data Infrastructure, Analytics and Visualization platforms)
  • Expert collaborator with the ability to democratize data through actionable insights and solutions
Job Responsibility:
  • Help define and build key datasets across all Robinhood product areas. Lead the evolution of these datasets as use cases grow
  • Build scalable data pipelines using Python, Spark and Airflow to move data from different applications into our data lake
  • Partner with upstream engineering teams to enhance data generation patterns
  • Partner with data consumers across Robinhood to understand consumption patterns and design intuitive data models
  • Ideate and contribute to shared data engineering tooling and standards
  • Define and promote data engineering best practices across the company
What we offer:
  • Bonus opportunities
  • Equity
  • Benefits

Data Engineer

Data Engineer role at Airtable, the no-code app platform. The position involves ...
Location:
United States, San Francisco
Salary:
179500.00 - 221500.00 USD / Year
Airtable
Expiration Date
Until further notice
Requirements:
  • 5+ years of professional experience designing, creating and maintaining scalable data pipelines, preferably in Airflow
  • Proficiency in at least one programming language (preferably Python)
  • Highly effective with SQL and understand how to write and tune complex queries
  • Experience wrangling data and understanding complex data systems
  • Passionate and thoughtful about building systems that enhance human understanding
  • Communicate with clarity and precision in written form
  • Experience communicating with graphs and plots
Job Responsibility:
  • Work between engineering organization and stakeholders to understand data needs and produce pipelines, data marts, and other data solutions
  • Design and update foundational business tables to simplify analysis across the entire company
  • Improve the performance and reliability of the data warehouse
  • Build and enforce a pattern language across the data stack to ensure data pipelines and tables are consistent, accurate, and well-understood
What we offer:
  • Benefits
  • Restricted stock units
  • Incentive compensation

Data Engineer

Become a player in our data engineering team, grow on a personal level and help ...
Location:
Serbia, Novi Beograd
Salary:
Not provided
MDPI
Expiration Date
Until further notice
Requirements:
  • A university degree, ideally in Computer Science or related science, technology or engineering field
  • 2+ years of relevant work experience in data engineering roles
  • Experience in data acquisition, data lakes, warehousing, modeling, and orchestration
  • Proficiency in SQL (including window functions and CTE)
  • Proficiency in RDBMS (e.g., MySQL, PostgreSQL)
  • Strong programming skills in Python (with libraries like Polars, optionally Arrow / PyArrow API)
  • First exposure to OLAP query engines (e.g., Clickhouse, DuckDB, Apache Spark)
  • Familiarity with Apache Airflow (or similar tools like Dagster or Prefect)
  • Strong teamwork and communication skills
  • Ability to work independently and manage your time effectively
Job Responsibility:
  • Assist in designing, building, and maintaining efficient data pipelines
  • Work on data modeling tasks to support the creation and maintenance of data warehouses
  • Integrate data from multiple sources, ensuring data consistency and reliability
  • Collaborate in implementing and managing data orchestration processes and tools
  • Help establish monitoring systems to maintain high standards of data quality and availability
  • Work closely with the Data Architect, Senior Data Engineers, and other members across the organization on various data infrastructure projects
  • Participate in the optimization of data processes, seeking opportunities to enhance system performance
What we offer:
  • Competitive salary and benefits package

Senior Data Engineer

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
180000.00 - 270000.00 USD / Year
Plaid
Expiration Date
Until further notice
Requirements:
  • 4+ years of dedicated data engineering experience, solving complex data pipelines issues at scale
  • Experience building data models and data pipelines on top of large datasets (in the order of 500TB to petabytes)
  • Value SQL as a flexible and extensible tool, and are comfortable with modern SQL data orchestration tools like dbt, Mode, and Airflow
  • Experience working with performant warehouses and data lakes (Redshift, Snowflake, Databricks)
  • Experience building and maintaining batch and realtime pipelines using technologies like Spark, Kafka
  • Appreciate the importance of schema design, and can evolve an analytics schema on top of unstructured data
  • Excited to try out new technologies and like to produce proof-of-concepts that balance technical advancement and user experience and adoption
  • Like to get deep in the weeds to manage, deploy, and improve low level data infrastructure
  • Empathetic working with stakeholders
Job Responsibility:
  • Understanding different aspects of the Plaid product and strategy to inform golden dataset choices, design and data usage principles
  • Have data quality and performance top of mind while designing datasets
  • Leading key data engineering projects that drive collaboration across the company
  • Advocating for adopting industry tools and practices at the right time
  • Owning core SQL and Python data pipelines that power our data lake and data warehouse
  • Well-documented data with defined dataset quality, uptime, and usefulness
What we offer:
  • Medical
  • Dental
  • Vision
  • 401(k)
  • Equity
  • Commission

Data Engineer

We build simple yet innovative consumer products and developer APIs that shape h...
Location:
United States, San Francisco
Salary:
163200.00 - 223200.00 USD / Year
Plaid
Expiration Date
Until further notice
Requirements:
  • 2+ years of dedicated data engineering experience, solving complex data pipeline issues at scale
  • Experience building data models and data pipelines on top of large datasets (in the order of 500TB to petabytes)
  • Value SQL as a flexible and extensible tool and are comfortable with modern SQL data orchestration tools like dbt, Mode, and Airflow
Job Responsibility:
  • Understanding different aspects of the Plaid product and strategy to inform golden dataset choices, design and data usage principles
  • Have data quality and performance top of mind while designing datasets
  • Advocating for adopting industry tools and practices at the right time
  • Owning core SQL and Python data pipelines that power our data lake and data warehouse
  • Well-documented data with defined dataset quality, uptime, and usefulness
What we offer:
  • Medical, dental, vision, and 401(k)

Data Engineer, Enterprise Data, Analytics and Innovation

Are you passionate about building robust data infrastructure and enabling innova...
Location:
United States
Salary:
110000.00 - 125000.00 USD / Year
Vaniam Group
Expiration Date
Until further notice
Requirements:
  • 5+ years of professional experience in data engineering, ETL, or related roles
  • Strong proficiency in Python and SQL for data engineering
  • Hands-on experience building and maintaining pipelines in a lakehouse or modern data platform
  • Practical understanding of Medallion architectures and layered data design
  • Familiarity with modern data stack tools, including: Spark or PySpark; workflow orchestration (Airflow, dbt, or similar); testing and observability frameworks; containers (Docker) and Git-based version control
  • Excellent communication skills, problem-solving mindset, and a collaborative approach
Job Responsibility:
  • Design, build, and operate reliable ETL and ELT pipelines in Python and SQL
  • Manage ingestion into Bronze, standardization and quality in Silver, and curated serving in Gold layers of our Medallion architecture
  • Maintain ingestion from transactional MySQL systems into Vaniam Core to keep production data flows seamless
  • Implement observability, data quality checks, and lineage tracking to ensure trust in all downstream datasets
  • Develop schemas, tables, and views optimized for analytics, APIs, and product use cases
  • Apply and enforce best practices for security, privacy, compliance, and access control, ensuring data integrity across sensitive healthcare domains
  • Maintain clear and consistent documentation for datasets, pipelines, and operating procedures
  • Lead the integration of third-party datasets, client-provided sources, and new product-generated data into Vaniam Core
  • Partner with product and innovation teams to build repeatable processes for onboarding new data streams
  • Ensure harmonization, normalization, and governance across varied data types (scientific, engagement, operational)
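
The Medallion layering referenced in these responsibilities (Bronze for raw ingestion, Silver for standardization and quality, Gold for curated serving) can be sketched very schematically in plain Python. This is an illustrative toy with hypothetical data, not Vaniam Group's implementation; real platforms apply the same layering with Spark or dbt over lakehouse tables.

```python
# Schematic Medallion-style layering: Bronze = raw as ingested,
# Silver = cleaned/standardized, Gold = curated aggregates.
# All records and field names below are hypothetical.

bronze = [
    {"order_id": "1", "amount": "10.50", "country": "us"},
    {"order_id": "2", "amount": "bad", "country": "US"},   # malformed row
    {"order_id": "3", "amount": "4.25", "country": "US"},
]

def to_silver(rows):
    """Standardize types and casing; drop rows that fail parsing."""
    out = []
    for r in rows:
        try:
            out.append({
                "order_id": int(r["order_id"]),
                "amount": float(r["amount"]),
                "country": r["country"].upper(),
            })
        except ValueError:
            continue  # quarantine/skip malformed rows
    return out

def to_gold(rows):
    """Curated aggregate: revenue per country."""
    totals = {}
    for r in rows:
        totals[r["country"]] = totals.get(r["country"], 0.0) + r["amount"]
    return totals

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'US': 14.75}
```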
What we offer:
  • 100% remote environment with opportunities for local meet-ups
  • Positive, diverse, and supportive culture
  • Passionate about serving clients focused on Cancer and Blood diseases
  • Investment in you with opportunities for professional growth and personal development through Vaniam Group University
  • Health benefits – medical, dental, vision
  • Generous parental leave benefit
  • Focused on your financial future with a 401(k) Plan and company match
  • Work-Life Balance and Flexibility
  • Flexible Time Off policy for rest and relaxation
  • Volunteer Time Off for community involvement