Data Engineer Specialist

India, Pune · Job Posted March 19, 2026

Job Description

We are seeking an experienced Data Engineering Specialist with strong hands-on expertise in Databricks on GCP, Python-based data engineering, and Spark processing. The individual will design, build, and optimise large-scale data pipelines across the GCP ecosystem, applying robust engineering practices, data quality frameworks, and cost-optimised solutions.

Job Responsibility

  • Design and build data pipelines on GCP using Databricks (Delta Lake and Unity Catalog) for orchestration and Dataproc for Spark execution, supporting both ETL/ELT and feature engineering workloads
  • Engineer declarative, modular, and reusable pipelines in Python, following configuration-as-code principles and CI/CD practices including Git-based promotion, testing, and deployment
  • Implement and maintain data quality and observability practices using validation frameworks, logging, metrics, and alerts
  • Optimise pipeline performance, reliability, and cost through techniques such as cluster sizing, auto-termination, Z-ordering, caching, and partitioning strategies
  • Apply robust error handling, parameterisation, and triggers within Cloud Data Fusion pipelines
  • Ensure operational excellence by maintaining monitoring, performance tuning, and continuous improvements across data products and workloads

Requirements

  • Strong expertise in Databricks on GCP including Delta Lake, notebooks/jobs, Unity Catalog, and cluster policies
  • Experienced in Cloud Data Fusion design, including pipeline management, error handling, and orchestration
  • Skilled in Dataproc Spark with experience building PySpark jobs, configuring ephemeral clusters, and handling initialisation actions
  • Proficient in Python for data engineering including packaging, unit testing, type hints, and linting
  • Strong SQL skills, specifically with BigQuery including performance tuning, partitioning, and clustering
  • Familiar with GCP services such as Cloud Storage, Pub/Sub, and Cloud Composer/Airflow
  • Holds a qualification such as B.E., B.Tech, BCA, MCA, BSc, or MSc in Computer Science or a related field

What we offer

  • The opportunity to build and scale data solutions using leading GCP and Databricks technologies
  • Exposure to enterprise-level CI/CD, observability, and configuration-as-code practices
  • A collaborative environment where innovation, continuous learning, and technical excellence are encouraged
  • The chance to contribute to high-impact global data platforms

Similar Jobs for Data Engineer Specialist (8 matching positions)

Data Engineer Specialist

As a Data Engineer Specialist, you’ll shape the backbone of our data strategy—de...
Location: Brazil
Salary: Not provided
Mindbody
Expiration Date: Until further notice
Requirements
  • Strong expertise in data engineering, platform architecture, or modern data warehousing
  • Deep expertise in Snowflake (Snowpark, Streams, Tasks, Data Sharing) and strong SQL/dbt fluency
  • Proficient in Python for automation, transformation logic, and data platform development
  • Experience implementing observability standards, anomaly detection, and data quality validation
  • Skilled in managing complex, multi-squad initiatives with a clear track record of driving scalable outcomes
  • Hands-on knowledge of Airflow, Atlan, and infrastructure as code (Pulumi or Terraform)
  • A strong grasp of data modeling principles and architecture patterns like star schema and bronze/silver/gold layering
  • Comfortable mentoring engineers, influencing technical roadmaps, and advocating for best practices
Job Responsibility
  • Architect and optimize cloud-native data warehouse solutions that power real-time decisions and long-term strategy
  • Design and scale dbt models across bronze/silver layers with a focus on data contracts, observability, and semantic consistency
  • Build and maintain robust ETL/ELT pipelines using tools like Fivetran, Snowpipe, and CDC to support streaming and batch workloads
  • Lead the design of infrastructure-as-code deployments with Pulumi or Terraform, driving CI/CD maturity and cost-efficient operations
  • Elevate data quality with comprehensive monitoring, anomaly detection, and validation frameworks
  • Collaborate across squads to align data contracts, expand shared data marts, and support machine learning enablement
  • Champion data governance, ensuring privacy and compliance through PII tagging, lineage, and role-based access controls
  • Drive continuous platform improvement through systematic tech debt remediation and modernization initiatives
  • Fulltime

AWS Data Engineer (Cloud Data Platform & Pipeline Specialist)

Design, develop, and maintain scalable cloud-based data pipelines using AWS serv...
Location: United States, Atlanta
Salary: Not provided
NTT DATA
Expiration Date: Until further notice
Requirements
  • 5+ years of experience in data engineering, with strong hands-on expertise in AWS data services (Glue, EMR, S3, RDS, DataSync, DMS)
  • 5+ years of proven experience building and managing data pipelines (batch and streaming) in cloud environments
  • 5+ years of experience in data migration, transformation frameworks, and large-scale data replication
  • Deep understanding of data modeling, data transformation, and reconciliation techniques (5+ years)
  • 5+ years of experience designing and implementing secure data access and governance (least-privilege principles)
  • 5+ years of hands-on experience with data validation, auditing, and reconciliation processes
  • Familiarity with regulatory or finance data environments and reporting workloads
  • Strong problem-solving skills and ability to work in a collaborative, fast-paced environment (5+ years)
Job Responsibility
  • Design, develop, and maintain scalable cloud-based data pipelines using AWS services such as Glue, EMR, S3, RDS, DataSync, and DMS
  • Build and optimize batch and streaming data orchestration workflows to support enterprise data platforms
  • Lead large-scale data migration efforts, including legacy-to-cloud transformations and replication strategies
  • Perform data modeling, transformation, and reconciliation to ensure high-quality, consistent datasets across systems
  • Implement secure data access patterns following least-privilege principles for pipelines and datasets
  • Collaborate with data architects, analysts, and business stakeholders to understand data requirements and deliver solutions
  • Establish robust data validation, reconciliation, and audit mechanisms to meet regulatory and reporting requirements
  • Troubleshoot and optimize performance of ETL/ELT pipelines and data workflows in AWS environments
  • Support governance, compliance, and audit readiness for data platforms in regulated environments (finance/reporting)
  • Fulltime

Ecom Data Engineer Specialist

PepsiCo operates in an environment undergoing immense and rapid change. Big data...
Location: United States, Purchase, New York
Salary: 64900.00 - 132550.00 USD / Year
PepsiCo
Expiration Date: Until further notice
Requirements
  • 4+ years of overall technology experience, including at least 3 years of hands-on software development, data engineering, and systems architecture
  • 3+ years of experience in SQL optimization and performance tuning
  • Experience with data modeling, data warehousing, and building high-volume ETL/ELT pipelines
  • Experience building/operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets
  • Experience with data profiling and data quality tools like Apache Griffin, Deequ, or Great Expectations
  • Current skills in the following technologies:
      • Python
      • Orchestration platforms: Airflow, Luigi, Databricks, or similar
      • Relational databases: Postgres, MySQL, or equivalents
      • MPP data systems: Snowflake, Redshift, Synapse, or similar
      • Cloud platforms: AWS, Azure, or similar
Job Responsibility
  • Own data pipeline development end-to-end, spanning data modeling, testing, scalability, operability, and ongoing metrics
  • Ensure that we build high-quality software by reviewing peer code check-ins
  • Define best practices for product development, engineering, and coding as part of a world-class engineering team
  • Collaborate in architecture discussions and architectural decision-making that is part of continually improving and expanding these platforms
  • Lead feature development in collaboration with other engineers: validate requirements/stories, assess current system capabilities, and decompose feature requirements into engineering tasks
  • Focus on delivering high-quality data pipelines and tools through careful analysis of system capabilities and feature requests, peer reviews, test automation, and collaboration with other engineers
  • Develop software in short iterations to quickly add business value
  • Introduce new tools/practices to improve data and code quality; this includes researching/sourcing 3rd party tools and libraries, as well as developing tools in-house to improve workflow and quality for all data engineers
What we offer
  • A business development incentive equity may be awarded based on eligibility and performance
  • Paid time off subject to eligibility, including paid parental leave, vacation, sick, and bereavement
  • Medical, Dental, Vision, Disability, Health, and Dependent Care Reimbursement Accounts, Employee Assistance Program (EAP), Insurance (Accident, Group Legal, Life), Defined Contribution Retirement Plan
  • Fulltime

Specialist - Data Engineer

Location: India, Chennai
Salary: Not provided
Nordex Group
Expiration Date: Until further notice
Requirements
  • University degree in Computer Science, Engineering, or a related technical field
  • 8–10+ years of experience as a data engineer or in a similar data-intensive role
  • Proficient knowledge of ETL, data lakes, and data warehousing
  • Proficient knowledge of SQL and relational databases, preferably SQL Server, SSIS, SSAS (OLAP cubes), and ADF
  • Good programming skills in languages such as R, Python, and Scala, and in Databricks (including Spark, Delta tables, and Unity Catalog)
  • Knowledge of data governance, security, and architecture
  • Experience with CI/CD workflows, preferably Azure DevOps and Git
  • Strong problem-solving and analytical skills with attention to detail
  • Good stakeholder management, excellent communication and interpersonal skills to collaborate with internal and external stakeholders
Job Responsibility
  • Build and maintain data pipelines to transform, process and ingest wind measurement and master data from multiple sources and data formats to drive analytical insights
  • Develop and optimize data models and schemas for storage and retrieval of structured and unstructured data
  • Perform analysis and troubleshooting to identify and resolve issues with data pipelines and infrastructure
  • Implement data quality and governance processes to ensure the accuracy, consistency, and reliability of data

Data Engineer / Integrations Specialist

We are seeking a Data Engineer to own the data layer of a growing AI-powered pla...
Location: Mexico
Salary: Not provided
Tech Holding
Expiration Date: Until further notice
Requirements
  • 5+ years working with Python in a data engineering or backend integration context
  • Hands-on experience building data pipelines and ETL processes, extracting, transforming, and loading data between systems
  • Proven experience integrating third-party REST APIs, including auth, rate limits, retries, and error handling
  • Strong understanding of data quality: validation, deduplication, schema management, error recovery
  • Comfortable owning a data track end-to-end: design → build → ship → monitor
  • Can read API documentation and figure things out independently
  • Strong async Python
Job Responsibility
  • Design, build, and maintain ETL processes, data pipelines, and API integrations that ensure data is accurate, consistent, and available across multiple customer environments
  • Build a data sync pipeline for a live customer: crawl ERP products, customers, and pricing data into our validation layer
  • Build an ERP connector with REST API integration (auth, retries, timeouts, error handling)
  • Build a durable workflow: validate extracted order data against ERP reference data, submit on approval
  • Implement data quality checks and monitoring for the sync pipeline
  • Develop a generalized ERP adapter pattern
  • Implement improved validation: confidence scoring, auto-submission rules, exception handling
  • Develop schema extensions as new customer requirements and ERP platforms surface
  • Build data reconciliation tooling across multiple customer environments
What we offer
  • Remote opportunity with collaborative team culture
  • Exposure to cloud-first environments and modern DevOps tooling
  • Opportunities for growth and cross-functional impact
  • Dynamic and fast-paced engineering environment

Clinical Data Validation Engineer Specialist

We are currently seeking a Senior Clinical Data Science Programmer to join our d...
Location: India, Bangalore
Salary: Not provided
iconplc
Expiration Date: Until further notice
Requirements
  • Advanced degree in a relevant field such as computer science, statistics, or life sciences
  • Extensive experience in programming for clinical trials, with proficiency in languages such as SAS, R, or Python
  • Strong problem-solving skills and the ability to work collaboratively in a fast-paced, cross-functional environment
  • Excellent attention to detail and organizational skills, with a commitment to delivering high-quality results
  • Strong communication and interpersonal skills, with the ability to effectively collaborate with diverse teams and influence outcomes
Job Responsibility
  • Developing, validating, and maintaining programming solutions for data analysis and reporting in clinical trials
  • Collaborating with clinical data scientists and biostatisticians to ensure the integration of programming solutions into the overall data management process
  • Overseeing the generation of statistical datasets, tables, listings, and figures to support regulatory submissions and study reports
  • Providing guidance on programming best practices, coding standards, and data quality control measures
  • Staying updated on advancements in programming languages and data management tools to enhance operational efficiencies
What we offer
  • Various annual leave entitlements
  • A range of health insurance offerings to suit you and your family’s needs
  • Competitive retirement planning offerings to maximize savings and plan with confidence for the years ahead
  • Global Employee Assistance Programme, LifeWorks, offering 24-hour access to a global network of over 80,000 independent specialized professionals who are there to support you and your family’s well-being
  • Life assurance
  • Flexible country-specific optional benefits, including childcare vouchers, bike purchase schemes, discounted gym memberships, subsidized travel passes, health assessments, among others

Data Engineer and AI Specialist

Location: India, Tirupati, Bangalore
Salary: Not provided
Sithafal
Expiration Date: Until further notice
Requirements
  • Bachelor’s degree in Information Technology or a related program
  • Previous professional experience as a data engineer or in a similar role
  • Prior hands-on experience with the Druid AI platform
  • Prior work experience in designing and building complex machine learning systems for natural language processing applications
  • Advanced familiarity with Kubernetes (AKS) and NGINX ingress controller for Kubernetes
  • Advanced knowledge in Python or another programming language and at least one machine learning framework
  • Prior experience in developing training materials and providing training to the user groups
  • Technical expertise with data models, data mining, and segmentation techniques
  • Ability to obtain Public Trust Clearance
Job Responsibility
  • Integration of AI Chatbot system
  • Updating knowledge articles within the AI and ability to train the solution
  • Capability to employ a variety of qualitative and quantitative analysis techniques to continually improve the user experience
  • Consistent collaboration with cross-functional teams and stakeholders
  • Ability to improve the scalability & performance of machine learning systems that handle high loads
  • Ability to analyze and interpret raw data
  • Developing and maintaining datasets
  • Improving data quality and efficiency
  • Prior hands-on experience working with Druid AI platform
  • Previous experience maintaining and enhancing Druid Artificial Intelligence (AI) Chatbot capability within an application
  • Fulltime

Lead Data Integration Specialist / Senior Full Stack Engineer

Lead Data Integration Specialist / Senior Full Stack Engineer – New York – Compe...
Location: United States, New York
Salary: Not provided
Orbis Consultants
Expiration Date: Until further notice
Requirements
  • 10+ years of hands-on experience in developing production-ready software
  • Experience maintaining and working with data integrations / external API sources
  • Demonstrates skill in manoeuvring both front-end and backend technical projects
  • Brings to the table a collaborative mindset, having effectively led engineering teams
  • Demonstrates a remarkable ability to adapt swiftly to the evolving needs of our growing organization
  • Proficient in client-side technologies such as TypeScript, JavaScript (ES6/React), HTML/CSS
  • Server-side proficiency in Python (Django)
  • Holds practical experience in managing relational databases, with a strong command over PostgreSQL
Job Responsibility
  • Design and build a scalable platform that simplifies the creation/operation of hundreds of data partner integrations
  • Liaise with engineers, designers, and product managers to translate our product and technical vision into a concrete roadmap
  • Partner with third-party vendors & our clients to gather requirements and co-create solutions
  • Craft high-quality, thoroughly-tested code that meets the unique requirements of our clients
  • Provide technical mentorship and guidance to fellow engineers
What we offer
  • Competitive Package
  • Unlimited PTO and flexible work policy
  • Fulltime