The Senior Data Engineer will be responsible for the architecture, design, development, and maintenance of data platforms, focusing on Python and PySpark for data processing and transformation. This role demands a strong technical leader with deep data engineering expertise and the ability to collaborate effectively across teams.
Job Responsibilities:
Design, develop, and optimize data architectures, pipelines, and data models
Build, test, and deploy scalable ETL/ELT processes using Python and PySpark (a minimal illustrative sketch follows this list)
Implement practices for data quality, governance, and security
Monitor, troubleshoot, and optimize data pipeline performance
Collaborate with DevOps and MLOps teams to manage data infrastructure
Provide guidance and mentorship to junior data engineers
Work closely with stakeholders to understand data requirements
Research and evaluate new data technologies, tools, and methodologies
Create and maintain comprehensive documentation for data pipelines and models
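For illustration only, below is a minimal sketch of the kind of batch ETL step described in the responsibilities above, written in Python with PySpark. The bucket paths, column names, and aggregation are hypothetical assumptions for this sketch, not details taken from the posting.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hypothetical example: aggregate raw order records into a daily revenue table.
spark = SparkSession.builder.appName("orders_etl_sketch").getOrCreate()

# Extract: read raw CSV data (header row assumed; schema inference kept simple for brevity).
orders = spark.read.option("header", True).csv("s3://example-bucket/raw/orders/")

# Transform: cast types, drop rows with missing amounts, and aggregate revenue per day.
daily_revenue = (
    orders
    .withColumn("order_ts", F.to_timestamp("order_ts"))
    .withColumn("amount", F.col("amount").cast("double"))
    .filter(F.col("amount").isNotNull())
    .groupBy(F.to_date("order_ts").alias("order_date"))
    .agg(F.sum("amount").alias("daily_revenue"))
)

# Load: write the result as Parquet, partitioned by date.
daily_revenue.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/curated/daily_revenue/"
)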
Requirements:
Bachelor's or Master's degree in Computer Science, Software Engineering, Data Science, or a related quantitative field
5+ years of professional experience in data engineering
Extensive hands-on experience with Python for data engineering tasks
Proven experience with PySpark for big data processing and transformation
Proven experience with cloud data platforms (e.g., AWS Redshift, S3, EMR, Glue; Azure Data Lake, Databricks, Synapse; Google BigQuery, Dataflow)
Strong experience with SQL and NoSQL databases (e.g., PostgreSQL, MySQL, MongoDB, Cassandra)
Extensive experience with distributed data processing frameworks, especially Apache Spark
Expert proficiency in Python is mandatory
Strong SQL mastery is essential
Familiarity with Scala or Java is a plus
In-depth knowledge of data warehousing concepts, dimensional modeling, and ETL/ELT processes
Hands-on experience with at least one major cloud provider (AWS, Azure, GCP) and their data services
Familiarity with Docker and Kubernetes is a plus
Proficient with Git and CI/CD pipelines
Excellent problem-solving and analytical abilities
Strong communication and interpersonal skills
Ability to work effectively in a fast-paced, agile environment
Proactive and self-motivated with a strong sense of ownership
Nice to have:
Experience with real-time data streaming and processing using PySpark Structured Streaming
Knowledge of machine learning concepts and MLOps practices
Familiarity with data visualization tools (e.g., Tableau, Power BI)
Contributions to open-source data projects
What we offer:
Global workforce benefits
Support for well-being, career growth, and work-life balance