Senior PySpark Developer - Vice President

Citi

Location:
United States, Tampa

Contract Type:
Employment contract

Salary:

113840.00 - 170760.00 USD / Year

Job Description:

We are seeking a highly skilled and experienced Senior PySpark Developer to join our dynamic technology team. This role requires an individual with deep expertise in Python, PySpark, Big Data technologies, and SQL, coupled with a strong ability to work independently and contribute significantly to complex data engineering initiatives. The ideal candidate will have a proven track record in designing, developing, and optimizing scalable data solutions, with experience in ETL processes and a keen interest in leveraging the latest technologies. Domain knowledge in Finance will be a significant advantage, enabling the candidate to contribute to critical financial crime compliance projects.

Job Responsibility:

  • Design, develop, and implement robust, scalable, and high-performance data pipelines and applications using Python, PySpark, and Big Data technologies
  • Work autonomously to analyze requirements, propose technical solutions, and deliver high-quality code and data products, ensuring alignment with architectural standards and business objectives
  • Utilize expertise in various Big Data platforms (e.g., Hadoop, Hive, Kafka, Spark) to process, transform, and manage large datasets efficiently
  • Write complex SQL queries, stored procedures, and optimize database performance for large-scale data warehousing and analytics solutions
  • Develop and enhance ETL (Extract, Transform, Load) processes, ensuring data quality, integrity, and timely delivery. Experience with various ETL tools and methodologies is a plus
  • Proactively research, evaluate, and integrate new and emerging technologies, frameworks, and tools to improve development processes and solution capabilities
  • Ensure adherence to coding standards, conduct thorough code reviews, and implement best practices for software development, data governance, and security
  • Diagnose and resolve complex technical issues related to data pipelines, performance bottlenecks, and system integrations in a fast-paced environment
  • Collaborate effectively with cross-functional teams including architects, data scientists, business analysts, and QA engineers. Provide technical guidance and mentorship to junior team members
  • Identify opportunities to use AI tools to speed up development, code reviews, unit testing, and deployment

Requirements:

  • 10+ years of experience in Applications Development, Systems Analysis, or equivalent senior engineering roles
  • Extensive hands‑on experience delivering enterprise‑scale, database‑driven platforms in a regulated environment
  • Expert-level proficiency in Python programming, including object-oriented design, data structures, algorithms, and extensive experience with various Python libraries
  • Deep expertise in developing, optimizing, and deploying PySpark applications for large-scale data processing, ETL, and real-time analytics on distributed systems (e.g., Spark SQL, Spark Streaming, DataFrames)
  • Strong understanding of Apache Spark architecture, Hadoop ecosystem, and experience with distributed computing concepts. Familiarity with big data storage formats (e.g., Parquet, ORC)
  • Solid experience with both relational databases (e.g., Oracle) and NoSQL databases (e.g., MongoDB). Strong SQL writing and optimization skills
  • Experience in designing, developing, and consuming RESTful APIs using Python frameworks (e.g., Flask, FastAPI, Django REST Framework)
  • Strong understanding and practical experience with CI/CD tools (e.g., Jenkins) and containerization technologies (Docker, Kubernetes)
  • Expert-level proficiency with Git
  • Experience with unit testing (e.g., Pytest), integration testing, and performance testing frameworks for Python and PySpark applications
  • Exposure to at least one major cloud provider (AWS, Azure, or GCP), specifically with their compute, storage, and data services (e.g., S3, ADLS, EMR, Databricks, Azure Synapse) preferred
  • Exposure to or direct experience with Artificial Intelligence (AI) and Machine Learning (ML) concepts, frameworks (e.g., TensorFlow, PyTorch), or relevant projects is a significant advantage
  • Exceptional analytical and problem-solving abilities, with a strong capacity to understand complex business needs and translate them into effective technical solutions
  • Excellent leadership, team management, and mentoring capabilities
  • Superior verbal and written communication skills, with the ability to articulate complex technical concepts clearly to both technical and non-technical audiences
  • Strong collaboration and interpersonal skills, with a proven ability to work effectively with cross-functional teams
  • Highly proactive and results-oriented, with a strong commitment to delivering high-quality, innovative solutions
  • Ability to thrive and lead in an agile, dynamic, and fast-paced work environment
  • Bachelor’s degree/University degree or equivalent experience
  • Master’s degree preferred

Nice to have:

  • Domain knowledge in Finance
  • Exposure to at least one major cloud provider (AWS, Azure, or GCP), specifically with their compute, storage, and data services (e.g., S3, ADLS, EMR, Databricks, Azure Synapse)
  • Exposure to or direct experience with Artificial Intelligence (AI) and Machine Learning (ML) concepts, frameworks (e.g., TensorFlow, PyTorch), or relevant projects
  • Master’s degree

What we offer:

  • medical, dental & vision coverage
  • 401(k)
  • life, accident, and disability insurance
  • wellness programs
  • paid time off packages, including planned time off (vacation), unplanned time off (sick leave), and paid holidays
  • discretionary and formulaic incentive and retention awards

Additional Information:

Job Posted:
May 16, 2026

Expiration:
June 05, 2026

Employment Type:
Full-time
Work Type:
Hybrid work

Similar Jobs for Senior PySpark Developer - Vice President

Pyspark Big Data Senior Developer - Vice President

We are building an A-team of highly skilled and autonomous engineers, and we are...
Location:
Canada, Mississauga
Salary:
120800.00 - 170800.00 USD / Year
Citi
Expiration Date:
Until further notice
Requirements:
  • 6+ years of extensive, hands-on experience as a Senior Big Data Developer, with a strong emphasis on PySpark and the Apache Spark ecosystem, operating as a player/coach
  • Expert proficiency in Python, with a proven track record of developing robust, scalable, and high-performance PySpark applications for large-scale data processing
  • Deep understanding and extensive hands-on experience with Apache Spark (Spark Core, Spark SQL, Spark Streaming) and its ecosystem
  • Experience with distributed computing frameworks such as Hadoop (HDFS, YARN)
  • Expert proficiency in SQL and extensive experience with data warehousing concepts and technologies (e.g., Hive, Snowflake, Redshift, Databricks SQL)
  • Proven experience with various data storage formats (e.g., Parquet, ORC, Avro) and data lake solutions (e.g., Delta Lake, Iceberg)
  • Experience with NoSQL databases (e.g., MongoDB, Cassandra, HBase) is a significant plus
  • Strong experience with Apache Kafka for building real-time data pipelines and event-driven architectures
  • Demonstrated experience with big data services on major cloud platforms (e.g., AWS EMR/Glue/Redshift, Azure Databricks/Data Factory/Synapse, GCP Dataflow/Dataproc/BigQuery) is highly desirable
  • Proven effectiveness with AI coding tools (e.g., Claude Code, Codex, Antigravity) is a mandatory requirement
Job Responsibility:
  • Operate end-to-end in the design, development, and implementation of robust big data solutions, ensuring optimal performance, scalability, data quality, and security
  • Collaborate closely within small, co-located squads (4-7 person teams), fostering high communication and low coordination overhead, to translate complex business requirements into technical specifications for big data processing and analytical solutions
  • Act as a player/coach within the team, mentoring junior members and leading by example in the development of efficient and innovative big data architectures
  • Design, develop, and optimize large-scale data pipelines using PySpark for data ingestion, transformation, and aggregation, always with an eye towards efficiency and domain relevance
  • Implement and manage real-time data streaming and event-driven architectures using technologies like Apache Kafka
  • Design and implement sophisticated data warehousing solutions and dimensional models for efficient data storage and retrieval, ensuring alignment with business needs
  • Work with various distributed data storage technologies, including distributed file systems (e.g., HDFS, S3) and NoSQL databases (e.g., MongoDB, Cassandra), selecting the right tool for the right problem
  • Implement efficient data processing and storage strategies to optimize the performance and scalability of big data applications, with a strong focus on the 'why' behind the technology choices
  • Champion best practices in software development, including rigorous code reviews, implementing comprehensive testing, and supporting continuous integration and continuous deployment (CI/CD) pipelines
  • Demonstrate high autonomy and agency in driving projects forward, making informed decisions, and proactively identifying areas for improvement
Employment Type:
Full-time

Senior Data Software Engineer (Python & PySpark) - Vice President

The Senior Data Software Engineer is a senior level position responsible for est...
Location:
Singapore, Singapore
Salary:
Not provided
Citi
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or Master's degree in Computer Science, Engineering, or a related quantitative field
  • 7+ years of experience in data engineering, with a strong focus on Python and big data technologies
  • Proven expertise in designing and implementing large-scale data processing solutions using PySpark
  • Extensive experience with distributed computing frameworks like Apache Spark
  • Strong understanding of data warehousing concepts, dimensional modeling, and ETL/ELT principles
  • Proficiency in SQL and experience with various relational and NoSQL databases
  • Experience with cloud platforms (AWS, Azure, GCP) and their data services (e.g., S3, ADLS, Google Cloud Storage, Redshift, Snowflake, BigQuery, Databricks)
  • Familiarity with workflow orchestration tools (e.g., Apache Airflow, Azure Data Factory, AWS Step Functions)
  • Experience with version control systems (e.g., Git)
  • Excellent problem-solving, analytical, and communication skills.
Job Responsibility:
  • Partner with multiple management teams to ensure appropriate integration of functions to meet goals as well as identify and define necessary system enhancements to deploy new products and process improvements
  • Resolve a variety of high-impact problems/projects through in-depth evaluation of complex business processes, system processes, and industry standards
  • Provide expertise in area and advanced knowledge of applications programming and ensure application design adheres to the overall architecture blueprint
  • Utilize advanced knowledge of system flow and develop standards for coding, testing, debugging, and implementation
  • Develop comprehensive knowledge of how areas of business, such as architecture and infrastructure, integrate to accomplish business goals
  • Provide in-depth analysis with interpretive thinking to define issues and develop innovative solutions
  • Serve as advisor or coach to mid-level developers and analysts, allocating work as necessary
  • Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency.
Employment Type:
Full-time

Apps Dev Tech Lead Analyst - Vice President

As a key member of our global development team, you will: Innovate & Develop: Pa...
Location:
United States, Irving
Salary:
125760.00 - 188640.00 USD / Year
Citi
Expiration Date:
Until further notice
Requirements:
  • 6-10 years of progressive experience in systems analysis and programming of software applications
  • Strong proficiency in Java application technologies, including deep experience with TDD (Test-Driven Development), Spring framework, and Microservices architecture
  • Extensive hands-on experience with PySpark and advanced Python programming skills
  • Proven experience with Big Data ecosystems, including Cloudera and/or Databricks
  • Hands-on experience with distributed query engines like Starburst (Trino/Presto)
  • Proficient in designing and managing complex workflows using scheduling tools, particularly Apache Airflow
  • Strong expertise in SQL and experience with relational and non-relational databases
  • Excellent knowledge of algorithms and data structures, design patterns
  • Strong Java experience: Java core, collections, concurrency, streams
  • Frameworks and APIs: Spring (Core, Batch, Integration, MVC, Boot, Data), Hibernate, Jackson, JAX RS, JPA, JAXB
Job Responsibility:
  • Innovate & Develop: Partner closely with project managers, business stakeholders, and senior managers to translate complex business requirements into well-architected technical solutions
  • Drive cross-functional collaboration with diverse management teams
  • Proactively identify, define, and implement necessary system enhancements
  • Complex Problem Resolution: Lead the resolution of high-impact problems and critical projects
  • Consult with users, clients, and other technology groups on issues
  • Technical Architecture & Standards Leadership: Serve as a subject matter expert in application programming
  • Leverage an advanced understanding of system flow to develop and enforce robust standards for coding, testing, debugging, and implementation
  • Mentorship & Talent Development: Act as a trusted advisor and coach for mid-level developers and analysts
  • Provide technical guidance, mentorship, and code reviews to junior data engineers
  • Operational Excellence: Ensure adherence to best practices and essential procedures
What we offer:
  • medical, dental & vision coverage
  • 401(k)
  • life, accident, and disability insurance
  • wellness programs
  • paid time off packages including planned time off (vacation), unplanned time off (sick leave), and paid holidays
Employment Type:
Full-time

Senior Java-Spark-Bigdata Engineer - Assistant Vice President

The Applications Development Senior Programmer Analyst is a senior-level positio...
Location:
India, Pune
Salary:
Not provided
Citi
Expiration Date:
Until further notice
Requirements:
  • 7-10 years of relevant experience in Data Engineering or a similar role, preferably within the Financial Services industry
  • Senior-level experience in an Applications Development or Data Engineering role
  • Consistently demonstrates clear and concise written and verbal communication
  • Demonstrated problem-solving and decision-making skills
  • Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
  • Bachelor's degree/University degree or equivalent experience
  • Hands-on expertise in Java (8+), Spring Boot, Python, and PySpark for building high-performance data applications
  • Extensive experience with the BigData ecosystem, including Apache Spark for large-scale data processing
  • Solid understanding of Data Warehouse concepts, design principles, and best practices
  • Strong proficiency with both relational SQL databases and NoSQL databases (e.g., MongoDB, Couchbase)
Job Responsibility:
  • Utilize expert knowledge of data engineering principles, big data technologies, and software development best practices to design and implement robust data solutions
  • Collaborate with business stakeholders, data scientists, and other technology teams to understand data requirements and deliver effective solutions
  • Apply deep expertise in programming languages like Python and Java for building high-performance data processing applications
  • Ensure data solutions are secure, scalable, and adhere to the firm's security and architectural standards
  • Mentor and guide junior engineers, fostering a culture of technical excellence and continuous learning
  • Lead the analysis of complex data-related issues, identify root causes, and implement sustainable solutions
  • Operate with a high degree of autonomy and independence, exercising sound judgment and decision-making
  • Act as a Subject Matter Expert (SME) in big data technologies for senior stakeholders and other team members
  • Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency
Employment Type:
Full-time

Senior Data Engineer - Vice President

Citi is seeking a highly skilled and experienced Senior Data Engineer to join ou...
Location:
United States, Irving
Salary:
125760.00 - 188640.00 USD / Year
Citi
Expiration Date:
May 18, 2026
Requirements:
  • Expert-level proficiency with Python and its data ecosystem (e.g., Pandas, NumPy, Dask). Extensive hands-on experience with the Spark framework, including deep knowledge of the DataFrame API, Spark SQL, and performance tuning techniques for distributed data processing
  • Proven experience developing on the Databricks Lakehouse Platform, including proficiency with Delta Lake, structured streaming, and optimizing Spark jobs within the Databricks environment
  • Strong, practical experience with the Ab Initio suite of products (GDE, Co>Operating System, Conduct>It) for designing and implementing enterprise-grade ETL workflows
  • Hands-on experience designing, building, and maintaining data warehouses in Snowflake
  • Experience using federated query engines like Starburst/Trino
  • Familiarity or experience with open table formats like Apache Iceberg for managing large analytic datasets
  • In-depth knowledge and multi-year experience with at least one major cloud provider (AWS, Google Cloud Platform, or Azure)
  • Practical experience building and managing data pipelines using cloud-native services such as AWS Glue, Lambda, S3, Redshift; Azure Data Factory, Synapse Analytics; or Google Cloud Composer, Dataflow, and BigQuery
Job Responsibility:
  • Design, build, and maintain scalable ETL/ELT pipelines using PySpark, Spark SQL, and Delta Lake on Databricks
  • Implement and manage data solutions on cloud platforms
  • Work extensively with big data frameworks and platforms such as Databricks, Snowflake, and open table formats like Apache Iceberg
  • Optimize Spark workloads and Databricks clusters
  • Implement and manage Lakehouse architecture using Delta Lake
  • Lead the design and architecture of Starburst-based data solutions
  • Implement and manage data federation strategies using Starburst connectors
  • Identify and resolve performance bottlenecks in data pipelines and queries
  • Develop and optimize robust data pipelines with a strong focus on data governance
  • Design and implement data models that support business intelligence, analytics, and machine learning use cases
What we offer:
  • Medical, dental & vision coverage
  • 401(k)
  • Life, accident, and disability insurance
  • Wellness programs
  • Paid time off packages, including planned time off (vacation), unplanned time off (sick leave), and paid holidays
  • Discretionary and formulaic incentive and retention awards
Employment Type:
Full-time

Senior Business & Data Analyst - Vice President

This role is within enterprise data office and product solution team; focused on...
Location:
India, Pune
Salary:
Not provided
Citi
Expiration Date:
Until further notice
Requirements:
  • 13+ years of combined experience in banking and financial services industry, information technology and/or data controls and governance
  • Preferably Engineering Graduate with Post Graduation in Finance
  • Extensive experience in Capital Markets business and processes
  • Deep understanding of Derivative products (e.g., Equities, FX, IRS, Commodities) or SFT (Repo, Reverse Repo, Securities Lending and Borrowing)
  • Strong data analysis skills using Excel, SQL, Python, PySpark, etc.
  • AI-Accelerated Data Analysis: Leverage AI-powered tools (e.g., AutoML, Python libraries) to clean, analyze, and visualize data from structured and unstructured sources
  • Well versed with Prompt Engineering & Automation utilizing GenAI and LLMs
  • Experience with data management processes and tools and applications, including process mapping and lineage toolsets
  • Actively managed various aspects of data initiatives including analysis, planning, execution, and day-to-day production management
  • Ability to identify and solve problems throughout the product development process
Job Responsibility:
  • Understand Derivatives and SFT data flows within CITI
  • Data analysis for derivatives products and SFT across systems for target state adoption and resolution of data gaps/issues
  • Lead assessment of end-to-end data flows for all data elements used in Regulatory Reports
  • Document current-state and target-state data mappings and produce a gap assessment
  • Coordinate with the business to identify critical data elements, define standards and quality expectations, and prioritize remediation of data issues
  • Identify appropriate strategic source for critical data elements
  • Design and Implement data governance controls including data quality rules and data reconciliation
  • Design systematic solution for elimination of manual processes/adjustments and remediation of tactical solutions
  • Prepare detailed requirement specifications containing calculations, data transformations and aggregation logic
  • Perform functional testing and data validations
Employment Type:
Full-time

Data Analytics Lead - Data Scientist - Vice President

The Data Analytics Lead / Data Scientist is a strategic professional who stays a...
Location:
India, Chennai; Pune
Salary:
Not provided
Citi
Expiration Date:
Until further notice
Requirements:
  • 10-15 years of relevant experience in Data Analytics, Data Science, or Advanced Analytics roles
  • Advanced proficiency in SQL and relational database concepts
  • Strong programming experience in Python (required); PySpark preferred
  • Hands-on experience building and deploying machine learning models (supervised and unsupervised)
  • Experience with ML libraries such as scikit-learn, XGBoost, TensorFlow, or PyTorch
  • Strong knowledge of statistical modeling, feature engineering, and model validation techniques
  • Experience with BI tools such as Tableau or Power BI
  • Familiarity with MLOps practices (model deployment, monitoring, versioning) is strongly preferred
  • Experience working with large-scale enterprise or financial datasets
Job Responsibility:
  • Integrates subject matter and industry expertise within a defined area
  • Contributes to data analytics standards around which others will operate
  • Applies in-depth understanding of how data analytics collectively integrate within the sub-function as well as coordinate and contribute to the objectives of the entire function
  • Employs developed communication and diplomacy skills to guide, influence, and convince others, in particular colleagues in other areas and occasional external customers
  • Resolves occasionally complex and highly variable issues
  • Produces detailed analysis of issues where the best course of action is not evident from the information available, but actions must be recommended/taken
  • Responsible for volume, quality, timeliness, and delivery of data science projects, along with short-term resource planning
  • Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency
  • Lead the design and execution of complex data analysis and AI/ML initiatives across large, structured, and unstructured datasets
  • Develop and deploy predictive, classification, clustering, and forecasting models to support business strategy and risk management
Employment Type:
Full-time

Data Engineer (Big Data, Python, Databricks) - Assistant Vice President

The Applications Development Senior Programmer Analyst is an intermediate level ...
Location:
India, Chennai, Pune
Salary:
Not provided
Citi
Expiration Date:
Until further notice
Requirements:
  • 5-8 years of relevant hands-on experience in Big Data technologies like Cloudera, Python, HQL, Java/PySpark
  • Knowledge of Machine Learning and AI would be an added advantage
  • Experience in systems analysis, data analysis and programming of software applications
  • Experience in managing and implementing successful projects
  • Working knowledge of consulting/project management techniques/methods
  • Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
  • Bachelor’s degree/University degree or equivalent experience
Job Responsibility:
  • Conduct tasks related to feasibility studies, time and cost estimates, IT planning, risk technology, applications development, model development, and establish and implement new or revised applications systems and programs to meet specific business needs or user areas
  • Monitor and control all phases of development process and analysis, design, construction, testing, and implementation as well as provide user and operational support on applications to business users
  • Utilize in-depth specialty knowledge of applications development to analyze complex problems/issues, provide evaluation of business process, system process, and industry standards, and make evaluative judgement
  • Recommend and develop security measures in post implementation analysis of business usage to ensure successful system design and functionality
  • Consult with users/clients and other technology groups on issues, recommend advanced programming solutions, and install and assist customer exposure systems
  • Ensure essential procedures are followed and help define operating standards and processes
  • Serve as advisor or coach to new or lower level analysts
  • Has the ability to operate with a limited level of direct supervision
  • Can exercise independence of judgement and autonomy
  • Acts as SME to senior stakeholders and/or other team members
Employment Type:
Full-time