Develop Databricks notebooks, jobs, and workflows to replicate and enhance DB2/Guidewire-based pipelines and transformations
Implement Delta Lake tables and patterns (bronze/silver/gold, ACID, time travel, schema evolution) for migrated data
Integrate Databricks with AWS/S3 or Azure ADLS, ADF/Synapse, Key Vault, and Snowflake as required
Optimize Databricks clusters, jobs, and queries for performance and cost
Implement incremental loads, CDC patterns, and batch schedules for large datasets
Collaborate with Snowflake and dbt teams to ensure consistent data models and data contracts
Participate in data validation and reconciliation between DB2/400 / Guidewire and Databricks outputs
Follow coding standards, version control, and CI/CD practices using Git/Azure DevOps
Provide defect fixes and support during SIT/UAT and post go-live stabilization
Requirements
8+ years of experience with Databricks, including understanding complex legacy data models and moving that data into the cloud
Extracting DB2/AS400: Experience with Change Data Capture (CDC) or scheduled batch extractions from DB2 into cloud storage. This involves working through JDBC connections, mapping table dependencies, and re-platforming legacy SQL to distributed computing standards
Handling Guidewire Data: Integrating with Guidewire Cloud Data Access (CDA) or InsuranceSuite to replicate complex P&C (Property & Casualty) insurance schemas. Senior engineers parse these highly normalized operational databases and transform them into analytics-friendly schemas in the cloud
2. Architecture & Pipeline Development: The core of the experience involves transitioning these legacy, row-based stores into a scalable Medallion Architecture (Bronze, Silver, Gold layers)
Delta Lake Optimization: Using Databricks and Apache Spark to build ETL/ELT data pipelines with ACID transactions. Senior engineers handle schema evolution, upserts, and slowly changing dimensions (SCD Type 2)
Business Logic Refactoring: Translating rigid legacy procedural code (e.g., RPG/COBOL background logic, stored procedures) into scalable distributed patterns (PySpark, Spark SQL, and Scala)
3. Data Governance & Observability: A senior engineer is expected to govern vast amounts of incoming and generated data across the enterprise
Unity Catalog: Implementing strict data governance, lineage tracing, and table-level security
Data Quality: Automating data validation frameworks to ensure a seamless transition from legacy to modern systems without data loss or corruption
4. Integration with the Databricks Platform Ecosystem: Moving beyond basic storage to use the full capabilities of the Databricks Data Intelligence Platform
Streaming and Batch Workflows: Building event-driven pipelines using features like Databricks Auto Loader to ingest flat files and streaming records directly into Delta tables