We are looking for an experienced Data Lake / ETL Engineer with 7+ years of experience in designing, developing, and managing large-scale data ingestion, transformation, and analytics pipelines. The role involves building scalable and secure data lake platforms, enabling business insights through efficient ETL/ELT frameworks, and ensuring data quality, performance, and governance across the enterprise ecosystem.
Job Responsibilities:
Design and implement data ingestion pipelines for structured, semi-structured, and unstructured data
Develop and manage ETL/ELT processes for large-scale data processing
Optimize storage and retrieval strategies across on-prem and cloud-based data lakes
Integrate data from multiple sources (databases, APIs, streaming platforms)
Implement real-time and batch processing using Apache Spark, Kafka, or Flink (a minimal batch example appears after this list)
Support metadata management, data lineage, and cataloging
Tune queries and pipelines for high performance and cost efficiency
Implement partitioning, indexing, and caching strategies for large datasets
Automate routine ETL/ELT workflows for reliability and speed
Ensure compliance with data governance, privacy, and regulatory standards (GDPR, HIPAA, etc.)
Implement encryption, masking, and role-based access control (RBAC)
Collaborate with cybersecurity teams to align with Zero Trust and IAM policies
Partner with data scientists, analysts, and application teams for analytics enablement
Provide L2/L3 support for production pipelines and troubleshoot failures
Mentor junior engineers and contribute to best practices documentation
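To make the batch-processing and partitioning items above more concrete, here is a minimal PySpark sketch. It is an illustration only, not a description of the team's actual pipelines: the paths, column names, and partition key are hypothetical placeholders.

# Minimal PySpark sketch: batch ingestion with a partitioned write.
# All paths, column names, and the partition key are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("orders-batch-ingest")  # hypothetical job name
    .getOrCreate()
)

# Read semi-structured source data (JSON) from a landing zone.
raw = spark.read.json("s3a://landing-zone/orders/")  # hypothetical path

# Light transformation: parse the timestamp, derive a partition column, deduplicate.
cleaned = (
    raw.withColumn("order_ts", F.to_timestamp("order_ts"))
       .withColumn("order_date", F.to_date("order_ts"))
       .dropDuplicates(["order_id"])
)

# A date-partitioned Parquet write lets large scans prune by partition.
(
    cleaned.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3a://data-lake/curated/orders/")  # hypothetical path
)

spark.stop()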
Requirements:
7+ years of experience in data engineering, ETL/ELT development, or data lake management
Strong expertise in ETL tools (Informatica, Talend, dbt, SSIS, or similar)
Hands-on experience with big data ecosystems: Hadoop, Spark, Hive, Presto, Delta Lake, or Iceberg
Proficiency with SQL, Python, or Scala for data processing and transformation
Experience with cloud data platforms (AWS Glue, Redshift, Azure Synapse, GCP BigQuery)
Familiarity with workflow orchestration tools (Airflow, Temporal, Oozie); a minimal DAG sketch follows below
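As a rough illustration of the workflow-orchestration requirement, the sketch below defines a two-task Airflow DAG, assuming a standard Airflow 2.x installation; the DAG id, schedule, and task bodies are hypothetical.

# Minimal Airflow 2.x DAG sketch: a two-step daily ETL workflow.
# DAG id, schedule, and task callables are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_orders():
    # Placeholder: pull raw records from a source system.
    print("extracting orders")


def load_orders():
    # Placeholder: write transformed records into the data lake.
    print("loading orders")


with DAG(
    dag_id="orders_daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    load = PythonOperator(task_id="load_orders", python_callable=load_orders)

    # Explicit dependency: extract must finish before load starts.
    extract >> load

Airflow, Temporal, and Oozie all express the same idea of explicit task dependencies; the operator chaining here is simply the Airflow-specific form.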
Nice to have:
Exposure to real-time data streaming (Kafka, Kinesis, Pulsar); see the consumer sketch after this list
Knowledge of data modeling (Kimball/Inmon), star schema, and dimensional modeling
Experience with containerized deployments (Docker, Kubernetes)
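For the real-time streaming item above, a minimal consumer loop might look like the sketch below, assuming the confluent-kafka Python client; the broker address, topic, and consumer group are hypothetical.

# Minimal Kafka consumer sketch using the confluent-kafka Python client.
# Broker address, topic name, and consumer group are hypothetical placeholders.
import json

from confluent_kafka import Consumer

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # hypothetical broker
    "group.id": "orders-ingest",            # hypothetical consumer group
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["orders"])              # hypothetical topic

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            # Skip errored messages; a real pipeline would log and alert here.
            continue
        event = json.loads(msg.value())
        # Placeholder: hand the event to a downstream sink or micro-batch.
        print(event.get("order_id"))
finally:
    consumer.close()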