Python and PySpark Developer Job at Citi (Chennai)

Job Description

We are seeking a motivated and detail‑oriented Python / PySpark Developer to support the development and maintenance of scalable data processing solutions. The ideal candidate has foundational experience in Python and exposure to Apache Spark (PySpark), along with a strong willingness to learn and grow in a distributed data engineering environment. You will work under the guidance of senior engineers and collaborate with data teams to build reliable data pipelines and contribute to analytics and reporting solutions.

Job Responsibilities

Assist in developing and maintaining data pipelines using Python and PySpark
Support ETL/ELT workflows for batch data processing
Write clean, readable, and well‑structured Python code following best practices
Perform basic data transformations, aggregations, and validations
Debug and troubleshoot pipeline issues with guidance from senior developers
Work with structured and semi‑structured data formats (CSV, JSON, Parquet, etc.)
Assist in integrating data from databases, APIs, and cloud storage systems
Help ensure data quality and consistency within pipelines
Support migration of legacy scripts to modern data platforms
Collaborate with team members on development tasks and code reviews
Participate in knowledge‑sharing and training sessions
Learn and adopt new tools, frameworks, and best practices
Assist in documenting data workflows and technical processes

Requirements

Basic to intermediate proficiency in Python
4-7 years of experience
Exposure to Apache Spark / PySpark (internship or project experience is acceptable)
Understanding of programming fundamentals and data structures
Basic knowledge of SQL and relational databases
Familiarity with data processing concepts and ETL fundamentals
Familiarity with the Linux/Unix command line is a plus
Understanding of coding best practices and version control (Git)
Basic debugging and problem‑solving skills
Exposure to unit testing concepts is a plus

Nice to have

Exposure to big data tools (Hive, Hadoop ecosystem, or similar)
Familiarity with cloud platforms (AWS / Azure / GCP)
Basic knowledge of job orchestration tools (Airflow, etc.)
Understanding of data pipelines and the workflow lifecycle
Academic or project experience with data engineering or analytics
