PySpark Big Data Developer

India, Pune · Job Posted April 23, 2026

Job Description

Market sales Technology is going through several transformational technology initiatives and is looking for an Automation Tester to join the global Testing Center of Excellence. This role is based in Pune, but it is part of a virtual team with other team members based in the US and India.

Job Responsibility

  • Work with a cross-functional and geographically dispersed team for quality assurance
  • Perform end-to-end feature development for the application, including requirements understanding, impact analysis, development and execution, production deployment, and maintenance
  • Work with multiple teams to develop and execute features
  • Work with teams in a collaborative style to engage partners and proactively manage activities

Requirements

  • At least 4 years’ experience in Python, SQL, and PySpark
  • Proficiency in distributed data processing and big data tools and technologies: Hadoop, HDFS, YARN, Spark, Hive
  • APIs and backend development (FastAPI / Flask)
  • DevOps: Git, CI/CD basics, Linux/Unix commands
  • Design, develop, and maintain big data pipelines using PySpark
  • Experience with integration of data from multiple data sources
  • Experience with developer tools such as OpenShift, TeamCity, uDeploy, Bitbucket, and GitHub
  • Proactively contribute to stability of overall production systems and troubleshoot key components and processes
  • Keep track of latest technology trends and proactively learn new technologies driving practical and reference implementations
  • Bachelor’s degree (preferably in technology/engineering or a related field)

Nice to have

  • Experience with NoSQL databases, such as ElasticSearch
  • Understanding of LLMs, prompt engineering, data query systems (NL to SQL), and basic knowledge of vector databases (FAISS, Pinecone, etc.)

What we offer

  • Opportunity for professional development in an international and multicultural organization
  • Unique opportunity to participate in global investment banking projects
  • Internal and external training
  • Development opportunities and challenging assignments
  • Attractive and stable employment conditions
  • Social benefits (medical care, Benefit System, life insurance, pension scheme)
  • Flexible working hours

Similar Jobs for PySpark Big Data Developer

8 matching positions

PySpark Big Data Developer

The Applications Development Intermediate Programmer Analyst is an intermediate ...
Location: India, Pune
Salary: Not provided
Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 2-5 years of relevant experience in the Financial Service industry
  • Intermediate level experience in Applications Development role
  • Consistently demonstrates clear and concise written and verbal communication
  • Demonstrated problem-solving and decision-making skills
  • Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
  • Bachelor’s degree/University degree or equivalent experience
  • Enterprise Application Development: 6-8 years in developing and managing enterprise-grade applications
  • Object-Oriented Programming (OOP): Solid foundation in OOP concepts
  • Big Data Development: Expertise in PySpark, HDFS, Hive, Sqoop, and Hadoop for Big Data environments
  • Database Technologies: Good exposure to SQL Server and ORACLE databases
Job Responsibility
  • Utilize knowledge of applications development procedures and concepts, and basic knowledge of other technical areas to identify and define necessary system enhancements, including using script tools and analyzing/interpreting code
  • Consult with users, clients, and other technology groups on issues, and recommend programming solutions, install, and support customer exposure systems
  • Apply fundamental knowledge of programming languages for design specifications
  • Analyze applications to identify vulnerabilities and security issues, as well as conduct testing and debugging
  • Serve as advisor or coach to new or lower level analysts
  • Identify problems, analyze information, and make evaluative judgements to recommend and implement solutions
  • Resolve issues by identifying and selecting solutions through the applications of acquired technical experience and guided by precedents
  • Has the ability to operate with a limited level of direct supervision
  • Can exercise independence of judgement and autonomy
  • Acts as SME to senior stakeholders and/or other team members
Employment type: Full-time

Senior Big Data Pyspark Developer

We are looking for a skilled and motivated Full Stack Developer to join our engi...
Location: Canada, Mississauga
Salary: 94300.00 - 141500.00 USD / Year
Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 5-6 years of professional software development experience
  • Proficiency in Java (including modern Java features)
  • Strong experience with Node.js
  • Strong experience with Angular (versions 2+)
  • Strong experience with Spring Boot and Spring MVC for building web applications and microservices
  • Proven experience with Microservices architecture design and implementation
  • Strong experience with Hibernate
  • Solid command of Oracle Database, including SQL and PL/SQL
  • Experience with MongoDB for NoSQL data management
  • Experience with caching mechanisms and technologies like Hazelcast
Job Responsibility
  • Contribute to the design, development, and implementation of robust software solutions, ensuring performance, scalability, and security
  • Collaborate with product managers, architects, and senior developers to translate business requirements into technical specifications and develop innovative solutions
  • Develop and maintain back-end services using Java, Spring Boot, Spring MVC, Node.js, and Microservices architecture
  • Build responsive and intuitive user interfaces using Angular
  • Design and manage databases, working with both relational (Oracle) and NoSQL (MongoDB) data stores, leveraging Hibernate for ORM
  • Implement caching strategies using technologies like Hazelcast to improve application performance
  • Implement event-driven architectures and data streaming solutions using Kafka
  • Develop and consume GraphQL APIs, ensuring efficient data exchange between front-end and back-end systems
  • Adhere to best practices in software development, including participating in code reviews, testing, continuous integration, and continuous deployment (CI/CD)
  • Actively learn from and contribute to the team, sharing knowledge and helping to maintain high technical standards
Employment type: Full-time

Pyspark Big Data Senior Developer - Vice President

We are building an A-team of highly skilled and autonomous engineers, and we are...
Location: Canada, Mississauga
Salary: 120800.00 - 170800.00 USD / Year
Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 6+ years of extensive, hands-on experience as a Senior Big Data Developer, with a strong emphasis on PySpark and the Apache Spark ecosystem, operating as a player/coach
  • Expert proficiency in Python, with a proven track record of developing robust, scalable, and high-performance PySpark applications for large-scale data processing
  • Deep understanding and extensive hands-on experience with Apache Spark (Spark Core, Spark SQL, Spark Streaming) and its ecosystem
  • Experience with distributed computing frameworks such as Hadoop (HDFS, YARN)
  • Expert proficiency in SQL and extensive experience with data warehousing concepts and technologies (e.g., Hive, Snowflake, Redshift, Databricks SQL)
  • Proven experience with various data storage formats (e.g., Parquet, ORC, Avro) and data lake solutions (e.g., Delta Lake, Iceberg)
  • Experience with NoSQL databases (e.g., MongoDB, Cassandra, HBase) is a significant plus
  • Strong experience with Apache Kafka for building real-time data pipelines and event-driven architectures
  • Demonstrated experience with big data services on major cloud platforms (e.g., AWS EMR/Glue/Redshift, Azure Databricks/Data Factory/Synapse, GCP Dataflow/Dataproc/BigQuery) is highly desirable
  • Proven effectiveness with AI coding tools (e.g., Claude Code, Codex, Antigravity) is a mandatory requirement
Job Responsibility
  • Operate end-to-end in the design, development, and implementation of robust big data solutions, ensuring optimal performance, scalability, data quality, and security
  • Collaborate closely within small, co-located squads (4-7 person teams), fostering high communication and low coordination overhead, to translate complex business requirements into technical specifications for big data processing and analytical solutions
  • Act as a player/coach within the team, mentoring junior members and leading by example in the development of efficient and innovative big data architectures
  • Design, develop, and optimize large-scale data pipelines using PySpark for data ingestion, transformation, and aggregation, always with an eye towards efficiency and domain relevance
  • Implement and manage real-time data streaming and event-driven architectures using technologies like Apache Kafka
  • Design and implement sophisticated data warehousing solutions and dimensional models for efficient data storage and retrieval, ensuring alignment with business needs
  • Work with various distributed data storage technologies, including distributed file systems (e.g., HDFS, S3) and NoSQL databases (e.g., MongoDB, Cassandra), selecting the right tool for the right problem
  • Implement efficient data processing and storage strategies to optimize the performance and scalability of big data applications, with a strong focus on the 'why' behind the technology choices
  • Champion best practices in software development, including rigorous code reviews, implementing comprehensive testing, and supporting continuous integration and continuous deployment (CI/CD) pipelines
  • Demonstrate high autonomy and agency in driving projects forward, making informed decisions, and proactively identifying areas for improvement
Employment type: Full-time

Big Data Developer

The Applications Development Intermediate Programmer Analyst is an intermediate ...
Location: India, Pune
Salary: Not provided
Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • Minimum of 5 years of overall IT experience, with at least 3 years of hands-on experience in Big Data technologies
  • Proven experience working with large-scale, high-volume datasets in distributed environments
  • Strong proficiency in Hadoop ecosystem tools, including HDFS (Hadoop Distributed File System) for data storage, Hive for data querying and warehousing, and Sqoop for data ingestion from relational databases
  • Advanced knowledge of Apache Spark, including: Spark Core, Spark SQL, and Spark Streaming (preferred)
  • Performance tuning and optimization techniques (e.g., partitioning, caching, memory management)
  • Solid programming skills in Python and PySpark for data processing and pipeline development
  • Strong command of SQL for complex queries, data transformations, and performance tuning
  • Hands-on experience in data sourcing, ingestion, and extraction from multiple structured and unstructured data sources
Job Responsibility
  • Participation in the establishment and implementation of new or revised application systems and programs in coordination with the Technology team
  • Contribute to applications systems analysis and programming activities
  • Design, develop, and optimize large-scale data processing systems
  • Work closely with cross-functional teams to build efficient data pipelines, perform data analysis, and support business-critical financial solutions
Employment type: Full-time

Big Data / PySpark Engineering Lead - Vice President

The Applications Development Technology Lead Analyst is a senior level position ...
Location: India, Pune
Salary: Not provided
Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • Highly experienced and skilled technical lead with 12+ years of experience in software building and platform engineering
  • Experience in Data Engineering, focused on Big Data ecosystems
  • Knowledge of Hadoop, YARN, Hive, Impala, Spark, and Spark SQL, with extensive experience developing high-volume data processing pipelines
  • Expert-level programming skills and hands-on experience in Python
  • Familiarity with data formats like Avro, Parquet, CSV, JSON
  • Hands-on experience in writing SQL queries
  • Highly experienced with Unix based operating systems and shell scripting
  • Experience with source code management tools such as Bitbucket, Git, etc.
  • Proficiency and hands-on experience with big data technologies: Hadoop, Spark, Hive, Kafka, and NoSQL databases (MongoDB, HBase)
  • Experience working with query engines like Trino, Presto, Starburst
Job Responsibility
  • Design and implement scalable, fault-tolerant batch and real-time data processing pipelines
  • Develop robust data models and schema designs optimized for both performance and storage efficiency
  • Evaluate and integrate emerging tools and frameworks (e.g., Spark, Flink, Kafka) into the existing stack
  • Provide in-depth analysis with interpretive thinking to define issues and develop innovative solutions
  • Develop comprehensive knowledge of how areas of business, such as architecture and infrastructure, integrate to accomplish business goals
  • Legacy Systems Decommissioning: Lead the strategic migration of data and logic from legacy platforms (e.g. on-premises SQL Servers) to a modern Data Lakehouse environment
  • ETL/ELT Transformation: Re-engineer existing stored procedures and complex legacy ETL jobs into scalable, distributed processing frameworks using Spark (Python) and Starburst/Trino
  • Validation & Parity Testing: Design and implement automated frameworks for Data Parity Testing to ensure 100% accuracy and consistency between legacy outputs and new big data results
  • Schema Evolution: Map and transform rigid, legacy relational schemas into flexible, high-performance formats optimized for the cloud (e.g., Parquet, Avro, or Iceberg)
  • Phased Cutover Management: Orchestrate a phased migration strategy (Parallel Run, Shadow Execution) to ensure zero downtime for downstream business applications and reporting tools
Employment type: Full-time

Big Data Developer

The Big Data Developer is a senior level position responsible for establishing a...
Location: Canada, Mississauga
Salary: 120800.00 - 170800.00 USD / Year
Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 6+ years of relevant experience in Big Data/Application Development or systems analysis roles, including building and operating production-grade data pipelines on Hadoop/Spark
  • Extensive experience in system analysis and in programming of big data applications and data platforms
  • Proven experience designing and managing Hadoop-based architectures, including cluster configuration, resource management (YARN), and ecosystem integration
  • Strong understanding and hands-on expertise with the Hadoop ecosystem: HDFS, YARN, MapReduce, Hive, HBase, and Spark
  • Strong hands-on and architectural knowledge of Python, PySpark, Unix/Linux, and SQL
  • Experience with data modeling, ETL processes, and data warehousing concepts and implementation
  • Experience implementing data security and governance (e.g., RBAC, encryption, data quality, data lineage, catalog)
  • Exposure to AI/ML lifecycle management, MLOps, and GenAI solution patterns and integration points
  • Experience with major cloud platforms—AWS, Azure, Google Cloud—and related big data services (e.g., EMR, HDInsight, Dataproc, Databricks)
  • Subject Matter Expert (SME) in at least one area of Big Data/Application Development (e.g., Spark performance tuning, Hive optimization, HBase administration, data security)
Job Responsibility
  • Partner with multiple management teams to ensure appropriate integration of functions to meet goals, and to identify and define necessary platform and system enhancements to deploy new data products and process improvements
  • Design and implement scalable and efficient Hadoop architecture solutions encompassing core ecosystem components, including HDFS, YARN, MapReduce, Hive, HBase, and Spark
  • Collaborate with data engineers, data scientists, and analytics stakeholders to understand data requirements and deliver robust, reliable pipelines and analytical datasets
  • Develop Spark/PySpark solutions to support near real-time data ingestion, analytics, and reporting, ensuring high performance and reliability
  • Optimize Hadoop and Spark clusters for performance and resource utilization, including capacity planning, tuning, and job orchestration best practices
  • Maintain and monitor Hadoop infrastructure to ensure high availability, reliability, and observability; implement proactive alerting, logging, and issue resolution
  • Implement and enforce data security and governance policies (e.g., access controls, encryption, data quality, lineage, and cataloging) across big data platforms
  • Troubleshoot and resolve issues across the Hadoop ecosystem (jobs, services, resource management), driving root-cause analysis and permanent fixes
  • Provide expertise in the area and advanced knowledge of applications programming, ensuring application and data solution design adheres to the overall architecture blueprint and cloud reference patterns
Employment type: Full-time

Senior Big Data Developer

The Applications Development Senior Programmer Analyst is an intermediate level ...
Location: India, Pune
Salary: Not provided
Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 8-14 years of relevant experience
  • Experience in systems analysis and programming of software applications
  • Experience in managing and implementing successful projects
  • Working knowledge of consulting/project management techniques/methods
  • Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
  • Bachelor’s degree/University degree or equivalent experience
  • Strong Object-Oriented Programming (OOP) concepts
  • Proficient in Python (specifically for PySpark)
  • Extensive experience with Apache Spark (PySpark), Hadoop, and related components like Hive and Sqoop
  • Skilled in writing shell scripts
Job Responsibility
  • Conduct tasks related to feasibility studies, time and cost estimates, IT planning, risk technology, applications development, model development, and establish and implement new or revised applications systems and programs to meet specific business needs or user areas
  • Monitor and control all phases of development process and analysis, design, construction, testing, and implementation as well as provide user and operational support on applications to business users
  • Utilize in-depth specialty knowledge of applications development to analyze complex problems/issues, provide evaluation of business process, system process, and industry standards, and make evaluative judgement
  • Recommend and develop security measures in post implementation analysis of business usage to ensure successful system design and functionality
  • Consult with users/clients and other technology groups on issues, recommend advanced programming solutions, and install and assist customer exposure systems
  • Ensure essential procedures are followed and help define operating standards and processes
  • Serve as advisor or coach to new or lower level analysts
  • Has the ability to operate with a limited level of direct supervision
  • Can exercise independence of judgement and autonomy
  • Acts as SME to senior stakeholders and/or other team members
Employment type: Full-time

Senior Python Big Data Developer

The Applications Development Senior Programmer Analyst is an intermediate level ...
Location: India, Pune
Salary: Not provided
Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 7 - 12 years of relevant experience
  • Experience in systems analysis and programming of software applications
  • Experience in managing and implementing successful projects
  • Working knowledge of consulting/project management techniques/methods
  • Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
  • Bachelor’s degree/University degree or equivalent experience
  • Strong expertise in Big Data technologies (Spark, Hadoop, Hive, Impala, Kafka, Scala, Cloudera)
  • Design, develop, and maintain robust and scalable data pipelines using Python, SQL, PySpark, and streaming technologies like Kafka
  • Strong SQL and NoSQL experience (Oracle, MongoDB, PostgreSQL) for data extraction, reconciliation, and transformation
  • Proficiency in Python and Shell scripting for data processing and automation
Job Responsibility
  • Conduct tasks related to feasibility studies, time and cost estimates, IT planning, risk technology, applications development, model development, and establish and implement new or revised applications systems and programs to meet specific business needs or user areas
  • Monitor and control all phases of development process and analysis, design, construction, testing, and implementation as well as provide user and operational support on applications to business users
  • Utilize in-depth specialty knowledge of applications development to analyze complex problems/issues, provide evaluation of business process, system process, and industry standards, and make evaluative judgement
  • Recommend and develop security measures in post implementation analysis of business usage to ensure successful system design and functionality
  • Consult with users/clients and other technology groups on issues, recommend advanced programming solutions, and install and assist customer exposure systems
  • Ensure essential procedures are followed and help define operating standards and processes
  • Serve as advisor or coach to new or lower level analysts
  • Has the ability to operate with a limited level of direct supervision
  • Can exercise independence of judgement and autonomy
  • Acts as SME to senior stakeholders and/or other team members
Employment type: Full-time