Job Description:
Who we're looking for:
We are looking for a Senior Data Engineer who brings hands-on depth in core Data Engineering across cloud platforms and solid experience with the Informatica PowerCenter platform (both development and administration). You will design scalable pipelines on modern cloud stacks while also owning the Informatica environment that powers enterprise data workflows.

Key Responsibilities:

Data Engineering
- Design, build, and maintain batch and near-real-time ETL/ELT pipelines handling high-volume data with 99%+ SLA adherence.
- Build ingestion and transformation layers across structured, semi-structured, and unstructured sources (relational DBs, flat files, REST APIs, streaming feeds).
- Implement medallion architecture (bronze/silver/gold) for lakehouse environments, ensuring clean separation of raw, conformed, and curated data zones.
- Develop transformation logic using Spark (or equivalent) with proper testing, documentation, and lineage tracking baked in.
- Build and manage CI/CD pipelines for data workflows: automated testing, environment promotion, and deployment standardization.
- Work across one or more major cloud platforms:
  - Azure: Data Factory, Synapse Analytics, ADLS Gen2, Databricks, Stream Analytics
  - AWS: Glue, S3, Redshift, Lambda
  - GCP: BigQuery, Dataflow, Cloud Composer
- Optimize cloud resource usage, tune pipeline performance, and implement cost-aware design patterns.
- Apply enterprise data modeling techniques: Kimball dimensional modeling, Data Vault, and normalized schemas suited to the use case.
- Implement data quality checks, validation rules, and anomaly detection within pipelines; integrate with governance tools (Purview, Collibra, or equivalent) for lineage and discoverability.
- Build near-real-time analytics pipelines using Azure Stream Analytics or equivalent; design event-driven ingestion patterns that handle late-arriving data.
- Partner with Data Scientists to deliver ML-ready feature stores and training data pipelines supporting GenAI/RAG use cases.

Informatica PowerCenter Development and Administration
- Build and maintain PowerCenter mappings, sessions, and workflows; design reusable transformations (Lookup, Joiner, Router, Aggregator, Expression, Update Strategy) with parameterized logic, error handling, and restart/recovery.
- Install, configure, and upgrade the PowerCenter server and client; manage Integration and Repository Services, repository structure, user roles, version control, and backup/recovery.
- Promote objects across DEV, QA, and PROD environments; troubleshoot session failures and connectivity issues.

Technical Skills and Experience:
- 4+ years of hands-on Data Engineering with production pipelines on at least one major cloud.
- Proficiency in SQL and at least one scripting language: Python, Shell, or PowerShell.
- Experience with Azure Data Factory or a comparable orchestration tool (Airflow, dbt).
- Hands-on experience with Databricks or Spark for large-scale transformations.
- 3+ years of Informatica PowerCenter development and administration.
- Strong understanding of data warehousing: star schema, Data Vault, incremental loads, SCD handling.