Bigdata Developer with PySpark

India, Pune · Job Posted January 23, 2026

Job Description

The Applications Development Intermediate Programmer Analyst is an intermediate level position responsible for participation in the establishment and implementation of new or revised application systems and programs in coordination with the Technology team. The overall objective of this role is to contribute to applications systems analysis and programming activities.

Job Responsibility

  • Develop and maintain data pipelines: Design, develop, and optimize scalable ETL (Extraction, Transformation, Loading) pipelines using PySpark to process large datasets
  • Coding and software engineering: Write clean, efficient, well-documented code primarily in Python (PySpark) and Java, often utilizing frameworks like Spring Boot
  • Collaboration and communication: Work with cross-functional teams, including senior developers, data engineers, analysts, and business partners, to understand data requirements and ensure seamless integration of solutions
  • Troubleshooting and optimization: Debug and resolve data processing issues and performance bottlenecks in Spark applications and other big data technologies
  • Full lifecycle involvement: Participate in the entire software development lifecycle (SDLC), from requirements analysis and design to testing, deployment, and operations, often using Agile/Scrum methodologies
  • Data integrity and quality: Ensure data quality and integrity throughout the data lifecycle

Requirements

  • 4-8 years of relevant experience in the Financial Services industry
  • Intermediate-level experience in an Applications Development role
  • Consistently demonstrates clear and concise written and verbal communication
  • Demonstrated problem-solving and decision-making skills
  • Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
  • Bachelor’s degree/University degree or equivalent experience
  • Strong understanding of Core Java and Object-Oriented Programming (OOP) concepts
  • Proficiency in Python, specifically for PySpark development
  • Hands-on experience or familiarity with Apache Spark (PySpark), Hadoop, and related ecosystem components like Hive and Sqoop
  • Basic knowledge of SQL and relational databases
  • Experience writing queries to validate and manipulate data
  • Familiarity with cloud services such as Amazon Web Services (AWS), Azure, or Google Cloud Platform (GCP)
  • Understanding of version control systems (e.g., Git)
  • Experience with development and testing tools (e.g., JIRA, Confluence)

Nice to have

  • Knowledge of distributed NoSQL databases (e.g., Elasticsearch, Cassandra, MongoDB) is a plus
  • Familiarity with DevOps practices and CI/CD pipelines is a bonus

Similar Jobs for Bigdata Developer with PySpark

8 matching positions

Senior Bigdata Developer

The Data Analytics Senior Analyst is a seasoned professional role. Applies in-de...
Location: India, Pune
Salary: Not provided
Company: Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • Database Development & Architecture: Design, develop, and maintain complex database solutions across MongoDB, Oracle, and other SQL databases. Create optimal data models, schemas, and stored procedures to support high-throughput applications.
  • Data Pipeline Construction: Build and manage resilient, scalable ETL/ELT pipelines using Python to process and integrate large volumes of data from diverse source systems into our core data platforms.
  • Big Data Engineering: Engineer and implement solutions within our Big Data ecosystem (e.g., Hadoop, Spark, Hive, Kafka) to handle large-scale data processing, batch analytics, and real-time data streams.
  • Python Development: Write high-quality, production-ready Python code for data manipulation, API development, and automation. Utilize a range of libraries and frameworks relevant to data engineering (e.g., Pandas, PySpark, SQLAlchemy, PyMongo).
  • Performance Optimization: Proactively monitor, troubleshoot, and optimize the performance of our databases and data pipelines. Focus on query tuning, indexing strategies, and resource management to ensure low-latency data access.
  • Data Quality and Integrity: Implement data quality checks, validation rules, and monitoring frameworks within the data pipelines to ensure the accuracy, consistency, and reliability of our KYC data.
  • Collaboration: Work closely with application developers, data scientists, and data analysts to understand their data requirements and provide robust, well-documented data solutions and services.
  • Technical Leadership: Provide subject matter expertise on database and data engineering best practices. Mentor junior engineers and contribute to a culture of technical excellence.
  • Education: Bachelor’s/University degree or equivalent experience
Job Responsibility
  • Applies in-depth disciplinary knowledge, contributing to the development of new techniques and the improvement of processes and work-flows.
  • Coordinates and contributes to the objectives of data science initiatives and the overall business by leveraging an in-depth understanding of how areas collectively integrate within the sub-function.
  • Assumes informal/formal leadership role through coaching and training of new recruits.
  • Significantly influences decisions, work, and performance of all teams through advice, counsel and/or facilitating services to others in the business.
  • Conducts strategic data analysis, identifies insights and implications, makes strategic recommendations, and develops data displays that clearly communicate complex analysis.
  • Mines and analyzes data from various banking platforms to drive optimization and improve data quality.
  • Delivers analytics initiatives to address business problems with the ability to identify data required, assess time & effort required and establish a project plan.
  • Consults with business clients to identify system functional specifications. Applies comprehensive understanding of how multiple areas collectively integrate to contribute towards achieving business goals.
  • Consults with users and clients to solve complex system issues/problems through in-depth evaluation of business processes, systems and industry standards; recommends solutions.
  • Fulltime

Senior Bigdata Cloud Developer

The Applications Development Senior Programmer Analyst is an intermediate level ...
Location: India, Chennai
Salary: Not provided
Company: Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 7-9 years of relevant experience
  • Hands-on with Python, PySpark, Scala, Kafka, the Big Data ecosystem, and Unix scripting
  • Snowflake and Databricks experience is an added benefit
  • Strong exposure to the cloud tech stack in AWS and GCP
  • Hands-on with JIRA and CI/CD pipeline setup/usage
  • Experience in systems analysis and programming of software applications
  • Experience in managing and implementing successful projects
  • Working knowledge of consulting/project management techniques/methods
  • Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
  • Bachelor’s degree/University degree or equivalent experience
Job Responsibility
  • Conduct tasks related to feasibility studies, time and cost estimates, IT planning, risk technology, applications development, model development, and establish and implement new or revised applications systems and programs to meet specific business needs or user areas
  • Monitor and control all phases of development process and analysis, design, construction, testing, and implementation as well as provide user and operational support on applications to business users
  • Utilize in-depth specialty knowledge of applications development to analyze complex problems/issues, provide evaluation of business process, system process, and industry standards, and make evaluative judgement
  • Recommend and develop security measures in post implementation analysis of business usage to ensure successful system design and functionality
  • Consult with users/clients and other technology groups on issues, recommend advanced programming solutions, and install and assist customer exposure systems
  • Ensure essential procedures are followed and help define operating standards and processes
  • Serve as advisor or coach to new or lower level analysts
  • Has the ability to operate with a limited level of direct supervision
  • Can exercise independence of judgement and autonomy
  • Acts as SME to senior stakeholders and/or other team members
  • Fulltime

Data Engineering Python and Pyspark - Assistant Vice President

Location: India, Chennai; Pune
Salary: Not provided
Company: Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 8 to 14 years of relevant experience in a Data Engineering role
  • Advanced SQL/RDBMS skills and experience with relational databases and database design
  • Strong proficiency in object-oriented languages; Python and PySpark are a must
  • Experience working with Bigdata - Hive/Impala/S3/HDFS
  • Experience working with data ingestion tools such as Talend or Ab Initio
  • Strong experience in data analysis, data exploration, and trend identification
  • Solid understanding of machine learning fundamentals such as regression, classification, clustering, and feature engineering
  • Strong proficiency in scripting languages like Bash, UNIX Shell scripting
  • Strong proficiency in data pipeline and workflow management tools
  • Strong project management and organizational skills
Job Responsibility
  • Build and maintain batch or real-time data pipelines in the data platform
  • Maintain and optimize the data infrastructure required for accurate extraction, transformation, and loading of data from a wide variety of data sources
  • Develop ETL (extract, transform, load) processes to help extract and manipulate data from multiple sources
  • Monitor and control all phases of development process and analysis, design, construction, testing, and implementation as well as provide user and operational support on applications to business users
  • Automate data workflows such as data ingestion, aggregation, and ETL processing
  • Analyze large datasets to identify trends, anomalies, and behavioral patterns
  • Apply machine learning and AI concepts (supervised / unsupervised learning) to support predictive and exploratory analysis
  • Perform feature engineering and data transformations to enable ML models
  • Support trend analysis, segmentation, clustering, and forecasting use cases
  • Interpret analytical results and translate them into business-friendly insights
  • Fulltime

Engineering Director Data Engineering

Location: India, Gurugram
Salary: Not provided
Company: Randstad (https://www.randstad.com)
Expiration Date: July 26, 2026
Requirements
  • Proficiency in Data Engineering
  • Strong knowledge of data pipeline development and data integration techniques
  • 5+ years of overall IT experience, which includes building & deployment of Bigdata applications using PySpark and AWS cloud
  • Extensive experience in design, build and deployment of Python-based applications
  • Hands-on experience with relational databases, preferably Oracle and PostgreSQL, and with writing complex SQL queries
  • Experience in various AWS services such as EMR, API Gateway, RDS instance, and Lambda
  • Understand complex data sets and ETL processes, and how they can be optimized using Spark
  • Experience with cloud-based data platforms and distributed computing frameworks
  • Ability to optimize data workflows for performance and scalability
  • Familiarity with database management systems and data warehousing concepts
Job Responsibility
  • Expected to be an SME who collaborates with and manages the team to perform
  • Responsible for team decisions
  • Engage with multiple teams and contribute on key decisions
  • Provide solutions to problems for their immediate team and across multiple teams
  • Lead the coordination of project activities to ensure timely delivery of software components
  • Mentor junior team members to support their professional growth and skill development
  • Facilitate effective communication between stakeholders to align technical solutions with business objectives
  • Collaborate with other teams to integrate Spark jobs into the overall data pipeline
  • Monitor and tune data loads and queries

Senior Java -Spark-Bigdata Engineer-Assistant Vice President

The Applications Development Senior Programmer Analyst is a senior-level positio...
Location: India, Pune
Salary: Not provided
Company: Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 7-10 years of relevant experience in Data Engineering or a similar role, preferably within the Financial Services industry
  • Senior-level experience in an Applications Development or Data Engineering role
  • Consistently demonstrates clear and concise written and verbal communication
  • Demonstrated problem-solving and decision-making skills
  • Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
  • Bachelor's degree/University degree or equivalent experience
  • Hands-on expertise in Java (8+), Spring Boot, Python, and PySpark for building high-performance data applications
  • Extensive experience with the BigData ecosystem, including Apache Spark for large-scale data processing
  • Solid understanding of Data Warehouse concepts, design principles, and best practices
  • Strong proficiency with both relational SQL databases and NoSQL databases (e.g., MongoDB, Couchbase)
Job Responsibility
  • Utilize expert knowledge of data engineering principles, big data technologies, and software development best practices to design and implement robust data solutions
  • Collaborate with business stakeholders, data scientists, and other technology teams to understand data requirements and deliver effective solutions
  • Apply deep expertise in programming languages like Python and Java for building high-performance data processing applications
  • Ensure data solutions are secure, scalable, and adhere to the firm's security and architectural standards
  • Mentor and guide junior engineers, fostering a culture of technical excellence and continuous learning
  • Lead the analysis of complex data-related issues, identify root causes, and implement sustainable solutions
  • Operate with a high degree of autonomy and independence, exercising sound judgment and decision-making
  • Act as a Subject Matter Expert (SME) in big data technologies for senior stakeholders and other team members
  • Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency
  • Fulltime

Lead Bigdata Engineer

The Data Analytics Lead Analyst is a strategic professional who stays abreast of...
Location: India, Pune
Salary: Not provided
Company: Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 12-15 years of experience using code for statistical modeling of large data sets
  • Strategic Data Analysis: Lead complex Bigdata analysis initiatives to identify patterns, anomalies, and opportunities for process improvement within the KYC lifecycle
  • Should also have strong experience with Python and PySpark, along with data governance
  • Insight Generation: Analyze large-scale customer and transactional datasets to generate actionable insights that enhance risk assessment models, improve operational efficiency, and strengthen compliance controls
  • Stakeholder Collaboration: Partner with senior stakeholders across Compliance, Technology, and Operations to understand business challenges, define analytical requirements, and present data-driven recommendations
  • Data-Driven Strategy: Play a key role in defining the data strategy for the KYC modernization program, including data sourcing, quality standards, and governance frameworks
  • Metrics & Reporting: Design, develop, and maintain advanced dashboards and reports to monitor Key Performance Indicators (KPIs) and Key Risk Indicators (KRIs). Provide regular updates to senior leadership on the effectiveness of KYC processes and the progress of the modernization project
  • Data Quality & Governance: Establish and oversee data quality frameworks to ensure the accuracy, completeness, and integrity of KYC data. Lead efforts to remediate data quality issues at their source
  • Advanced Analytics: Utilize statistical techniques and advanced analytical methodologies to develop predictive models for customer risk segmentation and to identify emerging financial crime typologies
  • Mentorship: Act as a subject matter expert and mentor for junior analysts, fostering a culture of analytical excellence and continuous learning within the team
Job Responsibility
  • Integrates subject matter and industry expertise within a defined area
  • Contributes to data analytics standards around which others will operate
  • Applies in-depth understanding of how data analytics collectively integrate within the sub-function, and coordinates and contributes to the objectives of the entire function
  • Employs developed communication and diplomacy skills to guide, influence and convince others, in particular colleagues in other areas and occasional external customers
  • Resolves occasionally complex and highly variable issues
  • Produces detailed analysis of issues where the best course of action is not evident from the information available, but actions must be recommended/taken
  • Responsible for volume, quality, timeliness and delivery of data science projects along with short-term resource planning
  • Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency
  • Fulltime

Technology Analyst - Native Hana

As an expert SAP Native HANA with ABAP and UI5 knowledge he/she should understan...
Location: India, Bangalore Area
Salary: Not provided
Company: Airbus (airbus.com)
Expiration Date: Until further notice
Requirements
  • Graduate with at least 5-7 years of experience in Native HANA Development (HANA XSA)
  • Good experience in developing HANA objects (Table Functions, SQL, CV, Database Procedure, ...)
  • Job scheduling and performance monitoring on SAP HANA. Good at performance analysis
  • Data visualization (SAC, Qlik…)
  • Knowledge of SLT/SDA
  • Global architecture knowledge: SAP backbone, data exchange patterns
  • Development knowledge: PySpark, Python, SQL, Java
  • Knowledge of the Palantir Foundry platform
  • Experience with the Foundry platform, building and deploying solutions using Data Pipelines, Data Ingestion Authoring and other related Foundry tools
  • Foundry: Pipeline engineering tools and best practices (code repositories, pipeline builder, ontology manager, markings)
Job Responsibility
  • Understand business processes and technical architecture of Airbus SAP landscape
  • Define/suggest SAP best practices and Golden rules
  • Perform deep technology root cause analysis and performance improvement
  • Perform source code auditing
  • Performance Tuning – ability to analyse and fine tune existing programs
  • Job Monitoring
  • Develop the solution as per the requirements with good quality and performance
  • Apply SAP & HANA best practices, Golden rules, and architect recommendations
  • Propose & develop the right ergonomic solution as per the requirements with good quality and performance
  • Fulltime

Lead Bigdata Analyst

The Data Analytics Lead Analyst is a strategic professional who stays abreast of...
Location: India, Pune
Salary: Not provided
Company: Citi (https://www.citi.com/)
Expiration Date: Until further notice
Requirements
  • 12-15 years of experience using code for statistical modeling of large data sets
  • Strategic Data Analysis: Lead complex Bigdata analysis initiatives to identify patterns, anomalies, and opportunities for process improvement within the KYC lifecycle
  • Strong experience with Python and PySpark, along with data governance
  • Insight Generation: Analyze large-scale customer and transactional datasets to generate actionable insights
  • Stakeholder Collaboration: Partner with senior stakeholders across Compliance, Technology, and Operations
  • Data-Driven Strategy: Play a key role in defining the data strategy for the KYC modernization program
  • Metrics & Reporting: Design, develop, and maintain advanced dashboards and reports to monitor Key Performance Indicators (KPIs) and Key Risk Indicators (KRIs)
  • Data Quality & Governance: Establish and oversee data quality frameworks
  • Advanced Analytics: Utilize statistical techniques and advanced analytical methodologies to develop predictive models
  • Mentorship: Act as a subject matter expert and mentor for junior analysts
Job Responsibility
  • Integrates subject matter and industry expertise within a defined area
  • Contributes to data analytics standards around which others will operate
  • Applies in-depth understanding of how data analytics collectively integrate within the sub-function
  • Employs developed communication and diplomacy skills to guide, influence and convince others
  • Resolves occasionally complex and highly variable issues
  • Produces detailed analysis of issues where the best course of action is not evident
  • Responsible for volume, quality, timeliness and delivery of data science projects along with short-term resource planning
  • Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup
  • Fulltime