Data Lake SME

Hewlett Packard Enterprise

Location:
India, Bangalore

Contract Type:
Not provided

Salary:
Not provided

Job Description:

We are looking for an experienced Data Lake / ETL Engineer with 7+ years of expertise in designing, developing, and managing large-scale data ingestion, transformation, and analytics pipelines. The role involves building scalable and secure data lake platforms, enabling business insights through efficient ETL/ELT frameworks, and ensuring data quality, performance, and governance across the enterprise ecosystem.

Job Responsibilities:

  • Design and implement data ingestion pipelines for structured, semi-structured, and unstructured data
  • Develop and manage ETL/ELT processes for large-scale data processing
  • Optimize storage and retrieval strategies across on-prem and cloud-based data lakes
  • Integrate data from multiple sources (databases, APIs, streaming platforms)
  • Implement real-time and batch processing using Apache Spark, Kafka, or Flink
  • Support metadata management, data lineage, and cataloging
  • Tune queries and pipelines for high performance and cost efficiency
  • Implement partitioning, indexing, and caching strategies for large datasets
  • Automate routine ETL/ELT workflows for reliability and speed
  • Ensure compliance with data governance, privacy, and regulatory standards (GDPR, HIPAA, etc.)
  • Implement encryption, masking, and role-based access control (RBAC)
  • Collaborate with cybersecurity teams to align with Zero Trust and IAM policies
  • Partner with data scientists, analysts, and application teams for analytics enablement
  • Provide L2/L3 support for production pipelines and troubleshoot failures
  • Mentor junior engineers and contribute to best practices documentation
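The partitioning strategy called out in the responsibilities above is, at its core, about grouping records into date-keyed buckets before writing. A minimal, stdlib-only Python sketch of the idea (the record shape and the Hive-style `year=/month=` layout are illustrative assumptions, not this role's actual stack):

```python
import datetime
from collections import defaultdict

def partition_key(record: dict) -> str:
    """Build a Hive-style partition path (year=YYYY/month=MM) from an event timestamp."""
    ts = datetime.datetime.fromisoformat(record["event_time"])
    return f"year={ts.year}/month={ts.month:02d}"

def partition_records(records):
    """Group records into partition buckets, mimicking a partitioned data-lake write."""
    buckets = defaultdict(list)
    for rec in records:
        buckets[partition_key(rec)].append(rec)
    return dict(buckets)

events = [
    {"id": 1, "event_time": "2026-01-15T08:30:00"},
    {"id": 2, "event_time": "2026-01-20T12:00:00"},
    {"id": 3, "event_time": "2026-02-03T09:15:00"},
]
print(partition_records(events))
```

Engines such as Spark or Hive apply the same keying idea at scale so that queries filtered by date can skip whole partitions.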

Requirements:

  • 7+ years of experience in data engineering, ETL/ELT development, or data lake management
  • Strong expertise in ETL tools (Informatica, Talend, dbt, SSIS, or similar)
  • Hands-on experience with big data ecosystems: Hadoop, Spark, Hive, Presto, Delta Lake, or Iceberg
  • Proficiency with SQL, Python, or Scala for data processing and transformation
  • Experience with cloud data platforms (AWS Glue, Redshift, Azure Synapse, GCP BigQuery)
  • Familiarity with workflow orchestration tools (Airflow, Temporal, or Oozie)
  • Bachelor’s or Master’s degree in Computer Science, IT, or related field
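The orchestration tools listed above (Airflow, Temporal, Oozie) all reduce to running tasks in dependency order over a DAG. A minimal sketch of that idea using Python's stdlib `graphlib` (the task names are hypothetical):

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: one extract feeds two transforms, which both feed a load step.
dag = {
    "extract_orders": set(),
    "clean_orders": {"extract_orders"},
    "enrich_orders": {"extract_orders"},
    "load_warehouse": {"clean_orders", "enrich_orders"},
}

# static_order() yields tasks so every task appears after all of its dependencies.
order = list(TopologicalSorter(dag).static_order())
print(order)  # extract first, load last; the two transforms may appear in either order
```

Real orchestrators add scheduling, retries, and backfills on top, but the dependency resolution is the same topological sort.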

Nice to have:

  • Exposure to real-time data streaming (Kafka, Kinesis, Pulsar)
  • Knowledge of data modeling (Kimball/Inmon), star schema, and dimensional modeling
  • Experience with containerized deployments (Docker, Kubernetes)
  • AWS Certified Data Analytics – Specialty / Azure Data Engineer Associate / GCP Data Engineer
  • Informatica/Talend/dbt certifications
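The star-schema and dimensional-modeling knowledge mentioned above boils down to narrow fact tables joined to descriptive dimension tables. An in-memory SQLite sketch (table names and data are illustrative, not from the posting):

```python
import sqlite3

# Minimal star schema: one fact table keyed to one dimension table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
    CREATE TABLE fact_sales (sale_id INTEGER PRIMARY KEY,
                             product_id INTEGER REFERENCES dim_product(product_id),
                             amount REAL);
    INSERT INTO dim_product VALUES (1, 'storage'), (2, 'compute');
    INSERT INTO fact_sales VALUES (10, 1, 99.0), (11, 1, 1.0), (12, 2, 50.0);
""")

# A typical dimensional query: aggregate facts, sliced by a dimension attribute.
rows = conn.execute("""
    SELECT d.category, SUM(f.amount)
    FROM fact_sales f JOIN dim_product d USING (product_id)
    GROUP BY d.category ORDER BY d.category
""").fetchall()
print(rows)  # [('compute', 50.0), ('storage', 100.0)]
```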

What we offer:
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion

Additional Information:

Job Posted:
March 04, 2026

Work Type:
On-site work

Similar Jobs for Data Lake SME

Pre-Sales Solution Engineer

We are seeking a highly skilled Enterprise focused Pre-Sales Solution Engineer t...
Location: Not provided
Salary: Not provided
LakeFS
Expiration Date: Until further notice

Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related technical field
  • 5+ years of experience in a technical pre-sales or solution engineering role, preferably in the software industry
  • Excellent verbal and written communication skills, with the ability to deliver technical presentations to both technical and non-technical audiences
  • Hands-on experience with Kubernetes and containerized environments
  • Experience leading client workshops, sales enablement sessions, and technical training
  • Ability to work independently as well as collaboratively across sales, product, and engineering teams
  • Excellent problem-solving skills and a creative approach to architecting solutions
  • Experience in technical sales of open source products is preferred
  • Strong understanding and hands-on experience in the Data & AI ecosystem, including: Data lake and data warehouse technologies (e.g., S3, Delta Lake, Iceberg, Hive, Glue, Snowflake)
  • Data processing frameworks (e.g., Spark, Databricks, Flink, Trino, Presto)
Job Responsibilities:
  • Own the technical win from discovery through demo, architecture design, pilot/POC, security review, and a clean handoff to Customer Success
  • Design and present tailored demos that showcase branch‑based workflows, reproducible experiments, schema‑safe changes, and instant data rollback
  • Scope and execute pilots with clear success criteria; create sample repos, notebooks, and automation (Spark/Databricks jobs, Airflow DAGs, CI pipelines) that prove value quickly
  • Build reference architectures for lakeFS OSS and Cloud across AWS/Azure/GCP; document IAM roles, private networking, scaling, GC/performance tuning, and disaster recovery
  • Answer deep technical questions as a lakeFS SME
  • Handle RFPs and security questionnaires; map controls to customer requirements and recommend compliant deployment patterns
  • Partner with sales to quantify business impact (risk reduction, developer velocity, storage efficiency) and co‑create the ROI/TCO narrative with champions

Senior Technical Product Marketing Manager

Fivetran is the data foundation for AI, enabling enterprises to scale analytics ...
Location: United States, Denver
Salary: 154283.00 - 19285350.00 USD / Year
Fivetran
Expiration Date: Until further notice

Requirements:
  • 4-8 years of experience in product marketing, data engineering, product management, sales engineering, and/or data and analytics consulting
  • Data industry experience and expertise: Either hands-on work with or a deep knowledge of data engineering with a focus on data integration, data lakes, open table formats (e.g. Delta Lake, Apache Iceberg), data warehouses, data catalogs and related technologies (e.g. SQL, dbt, python, Spark)
  • Technical aptitude and a desire to learn more: Knowledge of modern data infrastructure and tools, particularly in relation to data lakes and their role in analytics and storage. Has a keen interest in all things data integration and cloud destinations and a willingness to work on new product areas as they arise
  • Experience working with multiple different teams: Proven experience partnering and coaching across Product, Sales, Marketing, and Enablement teams to deliver impactful results
  • Strong in-person, virtual, and written communication: Exceptional verbal and written communication skills, capable of adapting messaging for different technical personas and mediums
  • Highly-organized with an excellent project management track record: Strong project management skills to juggle multiple initiatives and projects, maintain proactive communication with stakeholders, and meet deadlines
  • An understanding of our target customers: The ability to gain a deep understanding of the needs and challenges of data engineers, data scientists, data architects, and technical stakeholders
Job Responsibilities:
  • Content Development & Messaging: Create high-impact technical collateral across the Fivetran platform, including architecture diagrams, demos, white papers, blogs, pitch decks, technical guides, and other customer-facing content. Develop thought leadership that translates complex technical concepts into clear, value-based messaging for technical audiences. Review and edit materials across the product portfolio to ensure technical accuracy, consistency, and strategic alignment
  • Customer & Market Insights: Conduct research on market trends, customer needs, and competitive dynamics across the modern data ecosystem to inform product positioning and messaging. Partner closely with Product Management to deeply understand features, technical capabilities, and customer use cases across multiple product areas. Develop how-to guides, demo videos, case studies, and technical whitepapers that support adoption and expansion across key products
  • Cross-Functional Collaboration: Act as a technical marketing partner to GTM Product Marketers, Demand Generation, Regional Marketing, Sales, and Enablement to support launches, campaigns, webinars, field events, and sales plays across priority product areas. Collaborate with Partner Marketing to showcase integrations and ecosystem partnerships through joint initiatives such as webinars, workshops, and hands-on labs
  • Technical Expertise: Operationalize technical product marketing best practices across teams. Stay up to date with trends in data integration, cloud data platforms, AI, deployment models, governance, etc. Serve as technical SME on the Fivetran platform, articulating how it fits into customers’ broader data stacks and workflows. Provide best practices, reference architectures, and technical validation to support sales cycles and customer conversations
  • Thought Leadership: Represent Fivetran as a platform expert through events, presentations, webinars, and hands-on labs. Support analyst relations and strategic conversations by providing detailed technical insights and validating product innovation across the portfolio. Help position Fivetran as a foundational component of modern, scalable data infrastructure
What we offer:
  • 100% employer-paid medical insurance
  • Generous paid time-off policy (PTO), plus paid sick time, inclusive parental leave policy, holidays, and volunteer days off
  • RSU stock grants
  • Professional development and training opportunities
  • Company virtual happy hours, free food, and fun team-building activities
  • Monthly cell phone stipend
  • Access to an innovative mental health support platform that offers personalized care and resources in areas such as: therapy, coaching, and self-guided mindfulness exercises for all covered employees and their covered dependents
  • Full-time

Senior Technical Product Marketing Manager

Fivetran is the data foundation for AI, enabling enterprises to scale analytics ...
Location: United States, Oakland
Salary: 178069.00 - 222586.00 USD / Year
Fivetran
Expiration Date: Until further notice

Requirements:
  • 4-8 years of experience in product marketing, data engineering, product management, sales engineering, and/or data and analytics consulting
  • Data industry experience and expertise: Either hands-on work with or a deep knowledge of data engineering with a focus on data integration, data lakes, open table formats (e.g. Delta Lake, Apache Iceberg), data warehouses, data catalogs and related technologies (e.g. SQL, dbt, python, Spark)
  • Technical aptitude and a desire to learn more: Knowledge of modern data infrastructure and tools, particularly in relation to data lakes and their role in analytics and storage. Has a keen interest in all things data integration and cloud destinations and a willingness to work on new product areas as they arise
  • Experience working with multiple different teams: Proven experience partnering and coaching across Product, Sales, Marketing, and Enablement teams to deliver impactful results
  • Strong in-person, virtual, and written communication: Exceptional verbal and written communication skills, capable of adapting messaging for different technical personas and mediums
  • Highly-organized with an excellent project management track record: Strong project management skills to juggle multiple initiatives and projects, maintain proactive communication with stakeholders, and meet deadlines
  • An understanding of our target customers: The ability to gain a deep understanding of the needs and challenges of data engineers, data scientists, data architects, and technical stakeholders
Job Responsibilities:
  • Content Development & Messaging: Create high-impact technical collateral across the Fivetran platform, including architecture diagrams, demos, white papers, blogs, pitch decks, technical guides, and other customer-facing content. Develop thought leadership that translates complex technical concepts into clear, value-based messaging for technical audiences. Review and edit materials across the product portfolio to ensure technical accuracy, consistency, and strategic alignment
  • Customer & Market Insights: Conduct research on market trends, customer needs, and competitive dynamics across the modern data ecosystem to inform product positioning and messaging. Partner closely with Product Management to deeply understand features, technical capabilities, and customer use cases across multiple product areas. Develop how-to guides, demo videos, case studies, and technical whitepapers that support adoption and expansion across key products
  • Cross-Functional Collaboration: Act as a technical marketing partner to GTM Product Marketers, Demand Generation, Regional Marketing, Sales, and Enablement to support launches, campaigns, webinars, field events, and sales plays across priority product areas. Collaborate with Partner Marketing to showcase integrations and ecosystem partnerships through joint initiatives such as webinars, workshops, and hands-on labs
  • Technical Expertise: Operationalize technical product marketing best practices across teams. Stay up to date with trends in data integration, cloud data platforms, AI, deployment models, governance, etc. Serve as technical SME on the Fivetran platform, articulating how it fits into customers’ broader data stacks and workflows. Provide best practices, reference architectures, and technical validation to support sales cycles and customer conversations
  • Thought Leadership: Represent Fivetran as a platform expert through events, presentations, webinars, and hands-on labs. Support analyst relations and strategic conversations by providing detailed technical insights and validating product innovation across the portfolio. Help position Fivetran as a foundational component of modern, scalable data infrastructure
What we offer:
  • 100% employer-paid medical insurance
  • Generous paid time-off policy (PTO), plus paid sick time, inclusive parental leave policy, holidays, and volunteer days off
  • RSU stock grants
  • Professional development and training opportunities
  • Company virtual happy hours, free food, and fun team-building activities
  • Monthly cell phone stipend
  • Access to an innovative mental health support platform that offers personalized care and resources in areas such as: therapy, coaching, and self-guided mindfulness exercises for all covered employees and their covered dependents
  • Full-time

Senior Bigdata Engineer

The Applications Development Senior Programmer Analyst is an intermediate level ...
Location: India, Pune
Salary: Not provided
Citi
Expiration Date: Until further notice

Requirements:
  • 8 - 10 years of relevant experience
  • Experience in systems analysis and programming of software applications
  • Experience in managing and implementing successful projects
  • Working knowledge of consulting/project management techniques/methods
  • Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
  • Programming Languages: Python, PySpark
  • Data Lake Table Format: Apache Iceberg
  • Data Orchestration: Apache Airflow
  • Data Visualization: Tableau
  • Big Data Processing: Apache Spark
Job Responsibilities:
  • Conduct tasks related to feasibility studies, time and cost estimates, IT planning, risk technology, applications development, model development, and establish and implement new or revised applications systems and programs to meet specific business needs or user areas
  • Monitor and control all phases of development process and analysis, design, construction, testing, and implementation as well as provide user and operational support on applications to business users
  • Utilize in-depth specialty knowledge of applications development to analyze complex problems/issues, provide evaluation of business process, system process, and industry standards, and make evaluative judgement
  • Recommend and develop security measures in post implementation analysis of business usage to ensure successful system design and functionality
  • Consult with users/clients and other technology groups on issues, recommend advanced programming solutions, and install and assist customer exposure systems
  • Ensure essential procedures are followed and help define operating standards and processes
  • Serve as advisor or coach to new or lower level analysts
  • Operate with a limited level of direct supervision
  • Exercise independence of judgement and autonomy
  • Act as SME to senior stakeholders and/or other team members
What we offer:
  • Equal employment opportunity
  • Full-time

Senior Cloud Data Architect

As a Senior Cloud Architect, your role will focus on supporting users, collabora...
Location: Spain, Barcelona
Salary: Not provided
Allianz
Expiration Date: Until further notice

Requirements:
  • Strong expertise in Azure cloud infrastructure, Data & AI technologies, and data platform management, with proficiency in Azure Synapse Analytics, Azure Machine Learning, Azure Data Lake, and Informatica Intelligent Data Management Cloud (IDMC)
  • Proven experience in modern Data Warehouse architectures (e.g., Lakehouse) and integrating machine learning models and AI capabilities using Azure services like Cognitive Services and Azure Bot Service for predictive analytics and automation
  • In-depth knowledge of data security and compliance practices using Azure AD, Azure Key Vault, and Informatica’s data governance tools, focusing on data privacy and regulatory standards
  • Expertise in optimizing resource usage, performance, and costs across Azure services and IDMC, leveraging tools like Azure Cost Management and Azure Monitor, and skilled in ETL/ELT tools and advanced SQL
  • Proficiency in data integration, machine learning, and generative AI from an architectural perspective, with hands-on experience in Python, SQL, Spark/Scala/PySpark, and container solutions like Docker and Kubernetes
  • Experience with CI/CD pipelines (e.g., GitHub Actions, Jenkins), microservices architectures, and APIs, with knowledge of architecture frameworks like TOGAF or Zachman, adept at managing multiple priorities in fast-paced environments, and excellent communication and presentation skills
  • Over 5 years of experience in cloud architecture focusing on Data & AI infrastructure, particularly in Azure, with expertise in building scalable, secure, and cost-effective solutions for data analytics and AI/ML environments.
Job Responsibilities:
  • Define and prioritize new functional and non-functional capabilities for the cloud-based data platform, ensuring alignment with business needs and Allianz's security, compliance, privacy, and architecture standards
  • Act as the platform SME for both potential and existing users, guiding them in the architecture of scalable, high-performance Data & AI solutions
  • Provide leadership and product guidance to engineering teams during the design, development, and implementation of new platform capabilities
  • Ensure all solutions meet defined quality standards and acceptance criteria
  • Work with stakeholders to co-create data solutions, optimizing business models and identifying opportunities for improved data usage
  • Lead the evaluation and selection of technologies and partners to implement data analytics use cases, focusing on proofs of concept and prototypes
  • Stay up to date with emerging trends in Data, Analytics, AI/ML, and cloud technologies
  • Leverage open-source technologies and cloud tools to drive innovation and cost-efficiency
  • Prepare materials for management briefings and public events
  • Represent the team in technical discussions, particularly regarding architecture and platform capabilities.
What we offer:
  • Hybrid work model which recognizes the value of striking a balance between in-person collaboration and remote working incl. up to 25 days per year working from abroad
  • Rewarding performance through company bonus scheme, pension, employee shares program, and multiple employee discounts
  • Career development and digital learning programs to international career mobility
  • Flexible working, health and wellbeing offers (including healthcare and parental leave benefits)
  • Support for balancing family and career, and help for employees returning from career breaks
  • Full-time

Principal Engineer I – Senior Azure Databricks Administrator

Software Resources has an immediate, direct hire job opportunity for a Principal...
Location: United States, Phoenix
Salary: Not provided
Software Resources
Expiration Date: Until further notice

Requirements:
  • 8+ years of related experience in data analytics administration and development
  • 4+ years of Databricks related experience
  • Bachelor’s degree in related field required
  • Advanced proven experience in Azure Databricks (Workspace management, Clusters, Jobs, Unity Catalog, Delta Lake, User access management, Rest APIs and SDKs)
  • Knowledge of MLFlow & MLOps
  • Deep understanding of Azure infrastructure and data services, including Azure Data Lake, Azure Data Factory, Azure SQL, Azure Synapse Analytics, Azure Key Vault, Azure Monitor, networking
  • Experience with CI/CD pipelines (Azure DevOps preferred)
  • Strong programming skills in SQL, Python, and/or PySpark
  • Advanced proven experience in leading cross-functional teams and managing multiple projects simultaneously
  • Advanced ability to see the big picture and align projects with organizational goals
Job Responsibilities:
  • Responsible for delivery and operations of the technologies and platforms required to model, transform, analyze, report on, and visualize data
  • Provide SME expertise in designing, building, optimizing, streamlining and automating the Azure Databricks platform
  • Partner with ML engineers, data scientists, data analysts, and enterprise architects to provide frameworks, set standards, enforce best practices, train and enable users
  • Develop technical skills of one or more junior team-members
  • Take assignments that can be worked on individually without supervision, and manage work effort from concept to completion
  • Design, build, optimize, automate and maintain the Azure Databricks platform, ensuring scalability, security, governance and performance
  • Design, implement and manage Azure Databricks workspaces, clusters, jobs, access management
  • Design, implement and manage policies, monitoring and observability
  • Implement data analytics principles aimed at business enablement, reliability practices and sound recovery procedures
  • Ensure compliance with IT policies, procedures, and industry standards
What we offer:
  • Competitive salaries
  • An ownership stake in the company
  • Medical and dental insurance
  • Time off
  • A great 401k matching program
  • Tuition assistance program
  • An employee volunteer program
  • A wellness program
  • Full-time

Solution Architect

The Solution Architect role involves driving the architectural transformation fo...
Location: Ireland, Dublin
Salary: Not provided
Citi
Expiration Date: Until further notice

Requirements:
  • Significant experience in Data modeling, Data lineage analysis, Operational reporting, preferably in a global organization
  • Proven architecture experience in solutioning of horizontally scalable, highly available, highly resilient data distribution platforms
  • Proficient in message queuing, stream processing, and highly scalable ‘big data’ data stores
  • Advanced working SQL knowledge and experience working with relational databases, query authoring (SQL) as well as working familiarity with a variety of databases
  • Experience performing root cause analysis on internal and external data and processes to answer specific business questions and identify opportunities for improvement
  • Strong analytic skills related to working with unstructured datasets
  • Extensive experience with Data Integration patterns
  • Extensive experience with Real/Near Real time streaming patterns
  • Strong background in Data Management, Data Governance, Transformation initiatives preferred
  • Preferred experience/familiarity with one or more of these tools: big data platforms (Hadoop, Apache Kafka); relational SQL, NoSQL, and cloud-native databases (Postgres, Cassandra, Snowflake); data pipeline and orchestration tools (Azkaban, Luigi, or Airflow); stream-processing engines (Apache Spark, Apache Storm, or Apache Flink); ETL tools (Talend, Ab Initio); data analytics/visualization tools (Looker, Mode, or Tableau)
Job Responsibilities:
  • Re-engineering the interaction of incoming and outgoing data flows from the Core Accounts DDA platform to Reference Data platforms, Data Warehouse, Data Lake as well as other local reporting systems which consume data from Core Accounts
  • Drive data architecture and roadmap for eliminating non-strategic point-to-point connections and batch handoffs
  • Define canonical data models for key entities and events related to Customer, Account, Core DDA in line with the Data Standards
  • Assess opportunities to simplify/rationalize/refactor the existing database schemas paving way for modularization of the existing stack
  • Provide technical guidance to Data Engineers responsible for designing an Operational Data Store for intra-day and end-of-day reporting
  • Implement data strategies and develop logical and physical data models
  • Formulate an efficient approach to rationalize and migrate existing reports
  • Build and nurture a strong engineering organization to deliver value to internal and external clients
  • Act as SME to senior stakeholders in business, operations, and technology divisions across upstream and downstream organizations
  • Monitor and control all phases of development process and analysis, design, construction, testing, and implementation as well as provide user and operational support on applications to business users
What we offer:
  • Competitive base salary (annually reviewed)
  • Hybrid working model (up to 2 days working at home per week)
  • Additional benefits supporting you and your family
  • Full-time