We are looking for a skilled Data Engineer II to join our Data Platform team. You will play a key role in building and optimizing our next-generation data infrastructure. Operating at Flipkart scale (petabytes of data), you will design, develop, and maintain high-throughput distributed systems, bridging traditional big data engineering with modern cloud-native and AI-driven workflows.
Job Responsibilities:
Design, develop, and maintain scalable ETL/ELT pipelines using Scala and Apache Spark/Flink (Batch & Streaming)
Optimize Spark jobs and SQL queries for performance, efficiency, and cost
Implement and manage Lakehouse architectures using Apache Iceberg, Hudi, or Delta Lake
Apply Medallion Architecture (Bronze/Silver/Gold) for analytics and ML readiness
Implement data quality checks and automated validation (Deequ, Great Expectations)
Enable data observability for freshness, lineage, and reliability
Deploy and manage workloads on GCP Dataproc and Kubernetes
Contribute to infrastructure automation and IaC
Collaborate with architects and product teams in an Agile/Scrum environment
Participate in code reviews and enforce engineering best practices
Explore GenAI and agentic workflows to improve data discovery and productivity
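To make the Lakehouse and data-quality responsibilities above concrete, here is a minimal sketch of a Bronze-to-Silver refinement step with an inline quality gate. It uses plain Scala collections in place of Spark DataFrames so it stays self-contained; all names (RawEvent, CleanEvent, toSilver) are illustrative assumptions, not part of any real Flipkart codebase.

```scala
// Bronze layer: raw, possibly malformed records as ingested (illustrative type).
final case class RawEvent(userId: String, amount: String)

// Silver layer: validated, typed records ready for analytics (illustrative type).
final case class CleanEvent(userId: String, amount: Double)

// A simple quality gate: drop records that fail parsing or basic constraints,
// mirroring the kind of checks that tools like Deequ or Great Expectations
// automate at scale. In a real pipeline this would be a Spark transformation.
def toSilver(bronze: Seq[RawEvent]): Seq[CleanEvent] =
  bronze.flatMap { e =>
    e.amount.toDoubleOption          // reject non-numeric amounts
      .filter(_ >= 0.0)              // reject negative amounts
      .filter(_ => e.userId.nonEmpty) // reject records with no user id
      .map(a => CleanEvent(e.userId, a))
  }

val bronze = Seq(
  RawEvent("u1", "19.99"), // valid
  RawEvent("", "5.0"),     // missing user id -> dropped
  RawEvent("u2", "oops")   // unparseable amount -> dropped
)
val silver = toSilver(bronze)
// silver == Seq(CleanEvent("u1", 19.99))
```

In a production Medallion pipeline the same gate logic would run as a Spark job writing to a Silver table in Iceberg/Hudi/Delta, with rejected records routed to a quarantine table for observability.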
Requirements:
1–5 years of experience as a Data Engineer / Big Data Engineer
Strong proficiency in Scala and Apache Spark (Batch & Streaming)
Solid understanding of SQL and distributed computing concepts
Experience with GCP (Dataproc, GCS, BigQuery) or equivalent cloud platforms (AWS/Azure)
Hands-on experience with Docker and Kubernetes
Experience with Lakehouse table formats (Iceberg, Hudi, Delta)
Understanding of data warehousing and data modeling concepts
Strong problem-solving and communication skills
Nice to have:
Experience building data pipelines for ML / feature engineering
Exposure to workflow orchestration tools (Airflow, Azkaban)
Experience with real-time analytics databases (Druid, ClickHouse, HBase)
Knowledge of CI/CD pipelines for data applications
Interest or experience in GenAI / AI-driven data workflows