PySpark Engineer

Citi

Location:
India, Haryana

Contract Type:
Not provided

Salary:
Not provided

Job Description:

Discover your future at Citi. Working at Citi is far more than just a job. A career with us means joining a team of more than 230,000 dedicated people from around the globe. At Citi, you’ll have the opportunity to grow your career, give back to your community and make a real impact.

Requirements:

  • Engineering degree with 1-2 years of experience in Big Data systems, Hive, Hadoop, Spark (Python/Scala), and cloud-based data management technologies (see the sketch after this list)
  • Hands-on experience in Unix scripting, Python, and Scala programming, along with strong experience in SQL
  • Comfortable working with completely unstructured, undocumented code and turning it into best-in-class code, redesigning costly compute and data processes and aligning with best development standards
  • Experience working with large and multiple datasets and data warehouses, and the ability to pull data using relevant programs and coding
  • Well-versed in the necessary data preprocessing and application engineering skills
  • At least 3 years of experience designing software systems with intense computational needs across real-time and batch processes
  • Experience with and understanding of supervised and unsupervised machine learning techniques
  • Knowledge of data management, data governance, data security, and regulatory practices
  • Ability to identify, clearly articulate, and solve complex business problems and present them to management in a structured, simplified form
  • Experience working in an onsite/offsite delivery model
  • Experience in Credit Cards and Retail Banking
  • Excellent communication and interpersonal skills
  • Strong process/project management skills
  • Management of multiple stakeholders
  • Control-oriented with risk awareness
  • Fast learner with a desire to excel and an attitude to partner and solve problems in complex environments, placing business objectives at the center of all activity
  • Bachelor's/University degree or equivalent experience
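
To make the first requirement concrete, here is a minimal PySpark sketch of the kind of Hive-plus-Spark batch job the role describes. All table, column, and path names are hypothetical, and the session assumes a cluster with a configured Hive metastore.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Hive-enabled session; assumes the cluster exposes a Hive metastore.
spark = (
    SparkSession.builder
    .appName("card-spend-aggregation")  # hypothetical job name
    .enableHiveSupport()
    .getOrCreate()
)

# Read a hypothetical Hive table and aggregate with the DataFrame API.
txns = spark.table("cards_db.transactions")
daily_spend = (
    txns.where(F.col("txn_status") == "POSTED")
        .groupBy("account_id", "txn_date")
        .agg(F.sum("amount").alias("daily_spend"))
)

# Write partitioned Parquet for downstream batch consumers.
daily_spend.write.mode("overwrite").partitionBy("txn_date").parquet(
    "/data/curated/daily_spend"  # placeholder output path
)
```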

Nice to have:

  • Exposure to data ingestion and ETL tools such as Talend, modeling tools, performance-management tooling such as Pepperdata, and the Cloudera stack is a plus
  • Experience in performance tuning and code re-engineering is preferred (see the tuning sketch after this list)
  • Broad IT architecture and design experience across data and channels is preferred
  • Experience in query tuning and automation technologies (Autosys, Jenkins, ServiceNow) is preferred
  • Exposure to container technology and machine learning is a plus
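
For the performance-tuning and query-tuning items above, a common starting point is reading the physical plan and steering the optimizer, for example with a broadcast-join hint. A minimal sketch, assuming Spark 3.x and hypothetical input paths:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("tuning-demo").getOrCreate()

# Hypothetical inputs: a large fact table and a small dimension table.
facts = spark.read.parquet("/data/facts")
dims = spark.read.parquet("/data/dims")

# Broadcasting the small side replaces a shuffle-heavy sort-merge join
# with a cheap map-side BroadcastHashJoin.
joined = facts.join(F.broadcast(dims), on="dim_id")

# Inspect the physical plan to confirm the join strategy actually chosen.
joined.explain(mode="formatted")
```

Without the hint, Spark decides based on its size estimates and the spark.sql.autoBroadcastJoinThreshold setting, so checking the plan is the reliable way to verify the tuning took effect.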

Additional Information:

Job Posted:
January 10, 2026

Employment Type:
Full-time
Work Type:
Hybrid work

Similar Jobs for PySpark Engineer

PySpark Data Engineer

The Data Analytics Intmd Analyst is a developing professional role. Deals with m...
Location:
India, Chennai
Salary:
Not provided
Citi
Expiration Date:
Until further notice
Requirements:
  • 4-8 years of relevant experience in Data Analytics and Big Data
  • SQL, Python, and PySpark, with Spark components
  • Minimum 4 years of experience as a Python developer with expertise in automation testing, designing, developing, and automating robust software solutions and testing frameworks like Pytest and Behave
  • 2-4 years of experience as a Big Data Engineer developing, optimizing, and managing large-scale data processing systems and analytics platforms
  • 4 years of experience in distributed data processing and near real-time data analytics using PySpark
  • Strong understanding of PySpark execution plans, partitioning, and optimization techniques (see the sketch after this list)
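
The last requirement can be illustrated with a short, hedged sketch of plan inspection and partition control; the input path, key column, and partition counts are placeholders:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("partitioning-demo").getOrCreate()

df = spark.read.parquet("/data/events")  # hypothetical input

# Repartition by the aggregation key so related rows are co-located,
# then print all plan stages to see how the exchange shows up.
df = df.repartition(200, "customer_id")
print(df.rdd.getNumPartitions())  # -> 200
df.explain(True)  # parsed, analyzed, optimized, and physical plans

# coalesce() shrinks the partition count without a full shuffle,
# useful right before writing to avoid many tiny files.
df.coalesce(20).write.mode("overwrite").parquet("/data/events_by_customer")
```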
Job Responsibility:
  • Integrates in-depth data analysis knowledge with a solid understanding of industry standards and practices
  • Demonstrates a good understanding of how data analytics teams and areas integrate with others in accomplishing objectives
  • Applies project management skills
  • Applies analytical thinking and knowledge of data analysis tools and methodologies
  • Analyzes factual information to make accurate judgments and recommendations focused on local operations and broader impacts
  • Applies professional judgment when interpreting data and results, breaking down information in a systematic and communicable manner
  • Employs developed communication and diplomacy skills to exchange potentially complex/sensitive information
  • Demonstrates attention to quality and timeliness of service to ensure the effectiveness of the team and group
  • Provides informal guidance or on-the-job-training to new team members
  • Appropriately assesses risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency
Employment Type: Full-time

Data Engineer

Join us as a Data Engineer responsible for supporting the successful delivery of...
Location:
India, Bengaluru
Salary:
Not provided
Barclays
Expiration Date:
Until further notice
Requirements:
  • Hands-on experience in PySpark and strong knowledge of DataFrames, RDDs, and Spark SQL
  • Hands-on experience in developing, testing, and maintaining applications on AWS Cloud
  • Strong grasp of the AWS data analytics technology stack (Glue, S3, Lambda, Lake Formation, Athena)
  • Design and implement scalable and efficient data transformation/storage solutions using Snowflake
  • Experience in data ingestion to Snowflake for different storage formats such as Parquet, Iceberg, JSON, and CSV (see the sketch after this list)
  • Experience using dbt (Data Build Tool) with Snowflake for ELT pipeline development
  • Experience writing advanced SQL and PL/SQL programs
  • Hands-on experience building reusable components using Snowflake and AWS tools/technology
  • Should have worked on at least two major project implementations
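
For the Snowflake ingestion bullet, a minimal sketch using the documented Spark-Snowflake connector sink; every connection value below is a placeholder, and the connector JAR is assumed to be on the classpath:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("snowflake-ingest").getOrCreate()

# Hypothetical source: Parquet files landed on S3.
df = spark.read.parquet("s3://my-bucket/landing/orders/")

# Connection options for the Spark-Snowflake connector (all placeholders).
sf_options = {
    "sfURL": "myaccount.snowflakecomputing.com",
    "sfUser": "ETL_USER",
    "sfPassword": "********",
    "sfDatabase": "ANALYTICS",
    "sfSchema": "RAW",
    "sfWarehouse": "LOAD_WH",
}

# Documented source name for the connector; newer releases also accept
# the short alias "snowflake".
SNOWFLAKE_SOURCE_NAME = "net.snowflake.spark.snowflake"

(df.write.format(SNOWFLAKE_SOURCE_NAME)
    .options(**sf_options)
    .option("dbtable", "ORDERS_RAW")  # placeholder target table
    .mode("append")
    .save())
```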
Job Responsibility:
  • Building and maintaining data architecture pipelines that enable the transfer and processing of durable, complete, and consistent data
  • Designing and implementing data warehouses and data lakes that manage the appropriate data volumes and velocity and adhere to the required security measures
  • Developing processing and analysis algorithms fit for the intended data complexity and volumes
  • Collaborating with data scientists to build and deploy machine learning models
What we offer:
  • Competitive holiday allowance
  • Life assurance
  • Private medical care
  • Pension contribution
Employment Type: Full-time

Data Engineer

Join us as a Data Engineer responsible for supporting the successful delivery of...
Location:
India, Bengaluru
Salary:
Not provided
Barclays
Expiration Date:
Until further notice
Requirements:
  • Hands-on experience in PySpark and strong knowledge of DataFrames, RDDs, and Spark SQL (see the sketch after this list)
  • Hands-on experience in developing, testing, and maintaining applications on AWS Cloud
  • Strong grasp of the AWS data analytics technology stack (Glue, S3, Lambda, Lake Formation, Athena)
  • Design and implement scalable and efficient data transformation/storage solutions using Snowflake
  • Experience in data ingestion to Snowflake for different storage formats such as Parquet, Iceberg, JSON, and CSV
  • Experience using dbt (Data Build Tool) with Snowflake for ELT pipeline development
  • Experience writing advanced SQL and PL/SQL programs
  • Hands-on experience building reusable components using Snowflake and AWS tools/technology
  • Should have worked on at least two major project implementations
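
The first bullet names three PySpark surfaces; here is the same query sketched through each of them, with a hypothetical input:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("api-demo").getOrCreate()

df = spark.read.json("/data/clicks")  # hypothetical input

# 1. DataFrame API:
top_pages = df.groupBy("page").count().orderBy("count", ascending=False)

# 2. Spark SQL over the same data via a temp view:
df.createOrReplaceTempView("clicks")
top_pages_sql = spark.sql(
    "SELECT page, COUNT(*) AS cnt FROM clicks GROUP BY page ORDER BY cnt DESC"
)

# 3. The underlying RDD, for row-level control when needed:
pages_rdd = df.rdd.map(lambda row: row["page"])
print(pages_rdd.take(5))
```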
Job Responsibility:
  • Investigating and analyzing data issues related to quality, lineage, controls, and authoritative source identification, documenting data sources, methodologies, and quality findings with recommendations for improvement
  • Designing and building data pipelines to automate data movement and processing
  • Applying advanced analytical techniques to large datasets to uncover trends and correlations, developing validated logical data models, and translating insights into actionable business recommendations that drive operational and process improvements, leveraging machine learning/AI
  • Translating analytical findings into actionable business recommendations through data-driven analysis, identifying opportunities for operational and process improvements
  • Designing and creating interactive dashboards and visual reports using applicable tools, and automating reporting processes for regular and ad-hoc stakeholder needs
What we offer:
  • Wellness rooms
  • On-site cafeterias
  • Fitness centers
  • Tech-equipped workstations
Employment Type: Full-time

Data Engineer

A career in Data & Analytics at Barclays is a hub for top talent, from beginners...
Location:
India, Pune
Salary:
Not provided
Barclays
Expiration Date:
Until further notice
Requirements:
  • Hands-on experience in PySpark and strong knowledge of DataFrames, RDDs, and Spark SQL
  • Hands-on experience in developing, testing, and maintaining applications on AWS Cloud
  • Strong grasp of the AWS data analytics technology stack (Glue, S3, Lambda, Lake Formation, Athena; see the Glue sketch after this list)
  • Design and implement scalable and efficient data transformation/storage solutions using Snowflake
  • Experience in data ingestion to Snowflake for different storage formats such as Parquet, Iceberg, JSON, and CSV
  • Experience using dbt (Data Build Tool) with Snowflake for ELT pipeline development
  • Experience writing advanced SQL and PL/SQL programs
  • Hands-on experience building reusable components using Snowflake and AWS tools/technology
  • Should have worked on at least two major project implementations
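
The AWS bullets map naturally onto a Glue PySpark job. A skeletal, hedged sketch follows; the catalog database, table, and bucket names are placeholders, and the awsglue module is only available inside the Glue runtime:

```python
import sys
from awsglue.utils import getResolvedOptions
from awsglue.context import GlueContext
from awsglue.job import Job
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read a table registered in the Glue Data Catalog (placeholder names).
dyf = glue_context.create_dynamic_frame.from_catalog(
    database="raw_db", table_name="orders"
)

# Convert to a plain Spark DataFrame for ordinary PySpark transformations.
df = dyf.toDF().dropDuplicates(["order_id"])

# Write curated output back to S3 as Parquet (placeholder bucket).
df.write.mode("overwrite").parquet("s3://my-bucket/curated/orders/")

job.commit()
```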
Job Responsibility:
  • Support the successful delivery of Location Strategy projects to plan, budget, agreed quality and governance standards
  • Spearhead the evolution of our digital landscape, driving innovation and excellence
  • Harness cutting-edge technology to revolutionise our digital offerings, ensuring unparalleled customer experiences
  • Investigate and analyze data issues related to quality, lineage, controls, and authoritative source identification, documenting data sources, methodologies, and quality findings with recommendations for improvement
  • Design and build data pipelines to automate data movement and processing
  • Apply advanced analytical techniques to large datasets to uncover trends and correlations, develop validated logical data models, and translate insights into actionable business recommendations that drive operational and process improvements, leveraging machine learning/AI
  • Translate analytical findings into actionable business recommendations through data-driven analysis, identifying opportunities for operational and process improvements
  • Design and create interactive dashboards and visual reports using applicable tools, and automate reporting processes for regular and ad-hoc stakeholder needs
What we offer
What we offer
  • Hybrid working
  • Structured approach to hybrid working, with fixed ‘anchor’ days onsite
  • Supportive and inclusive culture and environment
  • Commitment to flexible working arrangements
  • International scale offering incredible variety, depth and breadth of experience
  • Chance to learn from a globally diverse mix of colleagues
  • Encouragement to embrace mobility and explore every part of operations
Employment Type: Full-time

Senior Big Data Engineer

The Big Data Engineer is a senior level position responsible for establishing an...
Location:
Canada, Mississauga
Salary:
94300.00 - 141500.00 USD / Year
Citi
Expiration Date:
Until further notice
Requirements:
  • 5+ Years of Experience in Big Data Engineering (PySpark)
  • Data Pipeline Development: Design, build, and maintain scalable ETL/ELT pipelines to ingest, transform, and load data from multiple sources
  • Big Data Infrastructure: Develop and manage large-scale data processing systems using frameworks like Apache Spark, Hadoop, and Kafka
  • Proficiency in programming languages like Python or Scala
  • Strong expertise in data processing frameworks such as Apache Spark and Hadoop
  • Expertise in Data Lakehouse technologies (Apache Iceberg, Apache Hudi, Trino)
  • Experience with cloud data platforms like AWS (Glue, EMR, Redshift), Azure (Synapse), or GCP (BigQuery)
  • Expertise in SQL and database technologies (e.g., Oracle, PostgreSQL)
  • Experience with data orchestration tools like Apache Airflow or Prefect (see the DAG sketch after this list)
  • Familiarity with containerization (Docker, Kubernetes) is a plus
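
For the orchestration requirement, a minimal Airflow DAG that submits a Spark job through the Spark provider's SparkSubmitOperator; the DAG id, script path, and connection id are placeholders, and the `schedule` argument assumes Airflow 2.4+:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

# Minimal daily DAG; assumes a "spark_default" connection to the cluster.
with DAG(
    dag_id="daily_spend_pipeline",  # hypothetical pipeline name
    start_date=datetime(2025, 1, 1),
    schedule="@daily",  # Airflow 2.4+ spelling of schedule_interval
    catchup=False,
) as dag:
    aggregate = SparkSubmitOperator(
        task_id="aggregate_daily_spend",
        application="/opt/jobs/daily_spend.py",  # placeholder PySpark script
        conn_id="spark_default",
        conf={"spark.sql.shuffle.partitions": "200"},
    )
```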
Job Responsibility:
  • Partner with multiple management teams to ensure appropriate integration of functions to meet goals, and identify and define necessary system enhancements to deploy new products and process improvements
  • Resolve a variety of high-impact problems/projects through in-depth evaluation of complex business processes, system processes, and industry standards
  • Provide expertise in area and advanced knowledge of applications programming and ensure application design adheres to the overall architecture blueprint
  • Utilize advanced knowledge of system flow and develop standards for coding, testing, debugging, and implementation
  • Develop comprehensive knowledge of how areas of business, such as architecture and infrastructure, integrate to accomplish business goals
  • Provide in-depth analysis with interpretive thinking to define issues and develop innovative solutions
  • Serve as advisor or coach to mid-level developers and analysts, allocating work as necessary
  • Appropriately assess risk when business decisions are made, demonstrating consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency
What we offer:
  • Well-being support
  • Growth opportunities
  • Work-life balance support
Employment Type: Full-time

Backend Data Engineer

The mission of the Data & Analytics (D&A) team is to enable data users to easily...
Location:
United States, Cincinnati
Salary:
Not provided
HonorVet Technologies
Expiration Date:
Until further notice
Requirements:
  • Strong proficiency in Databricks (SQL, PySpark, Delta Lake, Jobs/Workflows)
  • Deep knowledge of Unity Catalog administration and APIs
  • Expertise in Python for automation scripts, API integrations, and data quality checks
  • Experience with governance frameworks (access control, tagging enforcement, lineage, compliance)
  • Solid foundation in security & compliance best practices (IAM, encryption, PII)
  • Experience with CI/CD and deployment pipelines (GitHub Actions, Azure DevOps, Jenkins)
  • Familiarity with monitoring/observability tools and building custom logging & alerting pipelines
  • Experience integrating with external systems (ServiceNow, monitoring platforms)
  • Experience with modern data quality frameworks (Great Expectations, Deequ, or equivalent; see the sketch after this list)
  • Strong problem-solving and debugging skills in distributed systems
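
Rather than assuming a particular Great Expectations or Deequ API version, the data-quality bullet can be sketched with hand-rolled PySpark checks in the same spirit; the table name and rule thresholds are hypothetical:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("dq-checks").getOrCreate()

df = spark.table("curated.orders")  # hypothetical governed table

total = df.count()

# Rule 1: order_id must never be null.
null_ids = df.where(F.col("order_id").isNull()).count()

# Rule 2: at most 1% of rows may carry a non-positive amount.
bad_amounts = df.where(F.col("amount") <= 0).count()

failures = []
if null_ids > 0:
    failures.append(f"order_id has {null_ids} null value(s)")
if total and bad_amounts / total > 0.01:
    failures.append(f"{bad_amounts}/{total} rows have non-positive amount")

# Fail loudly so orchestration and alerting pipelines can react.
if failures:
    raise ValueError("Data quality checks failed: " + "; ".join(failures))
```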
Job Responsibility:
  • Databricks & Unity Catalog Engineering: Build and maintain backend services leveraging Databricks (SQL, PySpark, Delta Lake, Jobs/Workflows)
  • Administer Unity Catalog including metadata, permissions, lineage, and tags
  • Integrate Unity Catalog APIs to surface data into the Metadata Catalog UI
  • Governance Automation: Develop automation scripts and pipelines to enforce access controls, tagging, and role-based policies
  • Implement governance workflows integrating with tools such as ServiceNow for request and approval processes
  • Automate compliance checks for regulatory and security requirements (IAM, PII handling, encryption)
  • Data Quality & Observability: Implement data quality frameworks (Great Expectations, Deequ, or equivalent) to validate datasets
  • Build monitoring and observability pipelines for logging, usage metrics, audit trails, and alerts
  • Ensure high system reliability and proactive issue detection
  • API Development & Integration: Design and implement APIs to integrate Databricks services with external platforms (ServiceNow, monitoring tools)

Lead Data Engineer

We are seeking an experienced Senior Data Engineer to lead the development of a ...
Location:
India, Kochi; Trivandrum
Salary:
Not provided
Experion Technologies
Expiration Date:
Until further notice
Requirements:
  • 5+ years of experience in data engineering with a focus on analytical platform development
  • Proficiency in Python and/or PySpark
  • Strong SQL skills for ETL processes and large-scale data manipulation
  • Extensive AWS experience (Glue, Lambda, Step Functions, S3; see the sketch after this list)
  • Familiarity with big data systems (AWS EMR, Apache Spark, Apache Iceberg)
  • Database experience with DynamoDB, Aurora, Postgres, or Redshift
  • Proven experience designing and implementing RESTful APIs
  • Hands-on CI/CD pipeline experience (preferably GitLab)
  • Agile development methodology experience
  • Strong problem-solving abilities and attention to detail
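
The AWS bullet can be illustrated with a hedged boto3 sketch that starts a Glue job and polls it with basic error handling; the job name, region, and timeout are placeholders:

```python
import time

import boto3

glue = boto3.client("glue", region_name="us-east-1")  # placeholder region

def run_glue_job(job_name: str, timeout_s: int = 3600) -> str:
    """Start a Glue job and poll until it finishes, raising on failure."""
    run_id = glue.start_job_run(JobName=job_name)["JobRunId"]
    deadline = time.time() + timeout_s
    while time.time() < deadline:
        run = glue.get_job_run(JobName=job_name, RunId=run_id)["JobRun"]
        state = run["JobRunState"]
        if state == "SUCCEEDED":
            return run_id
        if state in ("FAILED", "ERROR", "STOPPED", "TIMEOUT"):
            raise RuntimeError(f"Glue job {job_name} ended in state {state}")
        time.sleep(30)  # poll interval
    raise TimeoutError(f"Glue job {job_name} did not finish in {timeout_s}s")

run_glue_job("curate-orders")  # hypothetical job name
```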
Job Responsibility:
  • Architect, develop, and maintain end-to-end data ingestion framework for extracting, transforming, and loading data from diverse sources
  • Use AWS services (Glue, Lambda, EMR, ECS, EC2, Step Functions) to build scalable, resilient automated data pipelines
  • Develop and implement automated data quality checks, validation routines, and error-handling mechanisms
  • Establish comprehensive monitoring, logging, and alerting systems for data quality issues
  • Architect and develop secure, high-performance APIs for data services integration
  • Create thorough API documentation and establish standards for security, versioning, and performance
  • Work with business stakeholders, data scientists, and operations teams to understand requirements
  • Participate in sprint planning, code reviews, and agile ceremonies
  • Contribute to CI/CD pipeline development using GitLab