A PySpark Developer is a specialized data engineering professional at the forefront of the big data revolution. These experts are the architects and builders of large-scale, distributed data processing systems, leveraging Apache Spark and the Python programming language. For professionals seeking challenging and impactful roles, PySpark Developer jobs are central to modern data-driven enterprises, enabling organizations to transform vast, unstructured data into actionable insights. The role sits at the intersection of software engineering, data science, and business analytics, making it a highly sought-after and rewarding career path.

A PySpark Developer's core responsibility is designing, building, testing, and maintaining robust data pipelines. These pipelines are the lifelines of an organization's data infrastructure, efficiently ingesting, cleansing, transforming, and aggregating massive datasets from diverse sources. A significant part of the day-to-day work involves writing, optimizing, and debugging PySpark code for Extract, Transform, Load (ETL) or Extract, Load, Transform (ELT) processes; a minimal example is sketched at the end of this overview. Developers are also deeply involved in performance tuning, which includes inspecting Spark execution plans, managing data partitioning strategies, and caching data in memory so that processing jobs run efficiently and cost-effectively. Beyond the pipelines themselves, they collaborate closely with data scientists, business analysts, and other stakeholders to understand data requirements and implement solutions that support advanced analytics, machine learning models, and business intelligence reporting.

To excel in PySpark Developer jobs, a specific and robust skill set is required. Mastery of the PySpark framework is fundamental, including a deep understanding of its core abstractions, Resilient Distributed Datasets (RDDs) and DataFrames (the typed Dataset API exists only in Spark's Scala and Java interfaces). Strong proficiency in Python is essential for writing clean, efficient, and maintainable code, often alongside libraries such as Pandas. An expert-level command of SQL is non-negotiable for complex data querying and manipulation, whether written directly against a warehouse or through Spark SQL (see the final sketch below). Beyond these core skills, a strong grasp of the wider big data ecosystem is highly valued: familiarity with cloud platforms such as AWS, Azure, or GCP, with Kafka for real-time data streaming, and with Hadoop for distributed storage is increasingly expected. Understanding data warehousing concepts and data modeling techniques is also crucial for building scalable, well-organized data solutions.

Educational backgrounds often include a degree in computer science or a related field, but practical, hands-on experience building and optimizing data pipelines is the true differentiator between candidates. For those with a passion for solving complex data problems at scale, PySpark Developer jobs offer a dynamic and future-proof career building the data backbone of the digital world.
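To make the ETL responsibilities above concrete, here is a minimal PySpark sketch of an extract-transform-load job. The bucket paths, file layout, and column names (order_id, amount, country, order_date) are hypothetical placeholders chosen for illustration, not part of any real pipeline.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_etl").getOrCreate()

# Extract: read raw CSV data, letting Spark infer the schema for brevity.
raw = (spark.read
       .option("header", True)
       .option("inferSchema", True)
       .csv("s3://example-bucket/raw/orders.csv"))

# Transform: drop incomplete records, normalise types, aggregate per day.
clean = (raw
         .dropna(subset=["order_id", "amount"])
         .withColumn("amount", F.col("amount").cast("double"))
         .withColumn("order_date", F.to_date("order_date")))

daily_revenue = (clean
                 .groupBy("country", "order_date")
                 .agg(F.sum("amount").alias("revenue"),
                      F.count("order_id").alias("orders")))

# Load: write the aggregated result as Parquet, partitioned by country.
(daily_revenue.write
 .mode("overwrite")
 .partitionBy("country")
 .parquet("s3://example-bucket/curated/daily_revenue"))
```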
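Performance tuning usually comes down to controlling partitioning, reusing intermediate results, and reading execution plans. The sketch below assumes the hypothetical curated Parquet output from the previous example and shows repartitioning by a key, caching, and inspecting the physical plan (the formatted explain mode requires Spark 3.0 or later).

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders_tuning").getOrCreate()

# Reload the hypothetical curated dataset written by the ETL sketch above.
daily_revenue = spark.read.parquet("s3://example-bucket/curated/daily_revenue")

# Repartition by the grouping key to reduce shuffle skew, then cache the
# result because several downstream queries will reuse it.
tuned = daily_revenue.repartition(200, "country").cache()

# Materialise the cache and inspect the physical execution plan.
tuned.count()
tuned.explain(mode="formatted")

# Downstream queries now read the cached, well-partitioned data.
top_markets = (tuned
               .groupBy("country")
               .agg(F.sum("revenue").alias("total_revenue"))
               .orderBy(F.desc("total_revenue"))
               .limit(10))
top_markets.show()
```

The partition count of 200 is an arbitrary illustration; in practice it is sized to the cluster and data volume.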
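Because Spark exposes the same engine through SQL, developers routinely mix the DataFrame API with Spark SQL. This final sketch registers the same hypothetical curated dataset as a temporary view and expresses a monthly roll-up as a SQL query; the table and column names carry over from the earlier examples and are illustrative only.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("orders_sql").getOrCreate()

# Register the hypothetical curated dataset as a temporary view.
(spark.read
 .parquet("s3://example-bucket/curated/daily_revenue")
 .createOrReplaceTempView("daily_revenue"))

# The same engine and optimiser run SQL and DataFrame code alike.
monthly = spark.sql("""
    SELECT country,
           date_trunc('month', order_date) AS month,
           SUM(revenue)                    AS revenue
    FROM daily_revenue
    GROUP BY country, date_trunc('month', order_date)
    ORDER BY month, revenue DESC
""")
monthly.show()
```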