Data Engineer Sr (Databricks) Job at NTT DATA (Bangalore)

Job Responsibilities

Develop Databricks notebooks, jobs, and workflows to replicate and enhance DB2/Guidewire-based pipelines and transformations
Implement Delta Lake tables and patterns (bronze/silver/gold, ACID, time travel, schema evolution) for migrated data
Integrate Databricks with AWS/S3 or Azure ADLS, ADF/Synapse, Key Vault, and Snowflake as required
Optimize Databricks clusters, jobs, and queries for performance and cost
Implement incremental loads, CDC patterns, and batch schedules for large datasets
Collaborate with Snowflake and dbt teams to ensure consistent data models and data contracts
Participate in data validation and reconciliation between DB2/AS400 and Guidewire sources and Databricks outputs
Follow coding standards, version control, and CI/CD practices using Git/Azure DevOps
Provide defect fixes and support during SIT/UAT and post go-live stabilization

Requirements

8+ years of experience in Databricks, including:
1. Understanding complex legacy data models and getting that data into the cloud
Extracting DB2/AS400: Experience with Change Data Capture (CDC) or scheduled batch extractions from DB2 into cloud storage. This involves working through JDBC connections, mapping table dependencies, and re-platforming legacy SQL to distributed computing standards
Handling Guidewire Data: Integrating with Guidewire Cloud Data Access (CDA) or InsuranceSuite to replicate complex P&C (Property & Casualty) insurance schemas. Senior engineers parse these highly normalized operational databases and transform them into analytics-friendly schemas in the cloud
2. Architecture & Pipeline Development: The core of the experience involves transitioning these legacy, row-based stores into a scalable Medallion Architecture (Bronze, Silver, Gold layers)
Delta Lake Optimization: Using Databricks and Apache Spark to build ETL/ELT data pipelines with ACID transactions. Senior engineers handle schema evolution, upserts, and slowly changing dimensions (SCD Type 2)
Business Logic Refactoring: Translating rigid legacy procedural code (e.g., RPG/COBOL background logic, stored procedures) into scalable distributed patterns (PySpark, Spark SQL, and Scala)
3. Data Governance & Observability: A senior engineer is expected to govern vast amounts of incoming and generated data across the enterprise
Unity Catalog: Implementing strict data governance, lineage tracing, and table-level security
Data Quality: Automating data validation frameworks to ensure a seamless transition from legacy to modern systems without data loss or corruption
4. Integration with the Databricks Platform Ecosystem: Moving beyond basic storage to utilizing the full power of the Databricks Data Intelligence Platform
Serverless Compute: Managing Databricks serverless resources, ensuring optimal cluster sizing, and reducing compute costs
Streaming and Batch Workflows: Building event-driven pipelines using features like Databricks Auto Loader to ingest flat files and streaming records directly into Delta tables
