Senior PySpark Data Engineer Job at Citi (Mississauga)

Senior PySpark Data Engineer

We are seeking a highly skilled and experienced Senior PySpark Data Engineer to ...

Location

India, Pune

Salary:

Not provided

Citi

Expiration Date

Until further notice

Requirements

6+ years of professional relevant experience in a data engineering role
Extensive hands-on experience with PySpark and advanced Python programming skills
Proven experience with Big Data ecosystems, including Cloudera and/or Databricks
Hands-on experience with distributed query engines like Starburst (Trino/Presto)
Proficient in designing and managing complex workflows using scheduling tools, particularly Apache Airflow
Strong expertise in SQL and experience with relational and non-relational databases
Solid understanding of data warehousing concepts, ETL/ELT processes, and data modeling techniques
Experience working in a Linux/Unix environment
Experience with GitHub and CI/CD pipelines
Bachelor’s degree/University degree or equivalent experience

Job Responsibility

Design, develop, and maintain robust, scalable, and high-performance data pipelines using PySpark
Develop, schedule, and monitor complex data workflows using orchestration tools like Apache Airflow
Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and deliver high-quality data solutions
Optimize and tune Spark jobs for performance and efficiency
Implement data quality checks and ensure data integrity across all data pipelines
Design and implement data models for optimal storage and retrieval
Mentor junior data engineers and promote best practices in data engineering
Ensure compliance with data governance and security policies
Troubleshoot and resolve data-related issues in a timely manner

Fulltime

Senior PySpark Data Engineer

The Applications Development Senior Programmer Analyst is an intermediate level ...

Location

Canada, Mississauga

Salary:

94300.00 - 141500.00 USD / Year

Citi

Expiration Date

Until further notice

Requirements

5-8 years of relevant experience
Experience in systems analysis and programming of software applications
Experience in managing and implementing successful projects
Working knowledge of consulting/project management techniques/methods
Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
Big Data Infrastructure: Develop and manage large-scale data processing systems using frameworks like Apache Spark, Hadoop, and Kafka
Proficiency in Python programming
Strong expertise in data processing frameworks such as Apache Spark, Hadoop
Expertise in Data Lakehouse technologies (Apache Iceberg, Trino, Delta Lake)
Expertise in SQL and database technologies (e.g., Oracle, PostgreSQL, etc.)

Job Responsibility

Conduct tasks related to feasibility studies, time and cost estimates, IT planning, risk technology, applications development, model development, and establish and implement new or revised applications systems and programs to meet specific business needs or user areas
Monitor and control all phases of development process and analysis, design, construction, testing, and implementation as well as provide user and operational support on applications to business users
Utilize in-depth specialty knowledge of applications development to analyze complex problems/issues, provide evaluation of business process, system process, and industry standards, and make evaluative judgement
Recommend and develop security measures in post implementation analysis of business usage to ensure successful system design and functionality
Consult with users/clients and other technology groups on issues, recommend advanced programming solutions, and install and assist customer exposure systems
Ensure essential procedures are followed and help define operating standards and processes
Serve as advisor or coach to new or lower level analysts
Operate with a limited level of direct supervision
Exercise independence of judgement and autonomy
Act as SME to senior stakeholders and/or other team members

Fulltime

Senior Data Software Engineer (Python & PySpark) - Vice President

The Senior Data Software Engineer is a senior level position responsible for est...

Location

Singapore, Singapore

Salary:

Not provided

Citi

Expiration Date

Until further notice

Requirements

Bachelor's or Master's degree in Computer Science, Engineering, or a related quantitative field
7+ years of experience in data engineering, with a strong focus on Python and big data technologies
Proven expertise in designing and implementing large-scale data processing solutions using PySpark
Extensive experience with distributed computing frameworks like Apache Spark
Strong understanding of data warehousing concepts, dimensional modeling, and ETL/ELT principles
Proficiency in SQL and experience with various relational and NoSQL databases
Experience with cloud platforms (AWS, Azure, GCP) and their data services (e.g., S3, ADLS, Google Cloud Storage, Redshift, Snowflake, BigQuery, Databricks)
Familiarity with workflow orchestration tools (e.g., Apache Airflow, Azure Data Factory, AWS Step Functions)
Experience with version control systems (e.g., Git)
Excellent problem-solving, analytical, and communication skills

Job Responsibility

Partner with multiple management teams to ensure appropriate integration of functions to meet goals as well as identify and define necessary system enhancements to deploy new products and process improvements
Resolve variety of high impact problems/projects through in-depth evaluation of complex business processes, system processes, and industry standards
Provide expertise in area and advanced knowledge of applications programming and ensure application design adheres to the overall architecture blueprint
Utilize advanced knowledge of system flow and develop standards for coding, testing, debugging, and implementation
Develop comprehensive knowledge of how areas of business, such as architecture and infrastructure, integrate to accomplish business goals
Provide in-depth analysis with interpretive thinking to define issues and develop innovative solutions
Serve as advisor or coach to mid-level developers and analysts, allocating work as necessary
Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency

Fulltime

Senior Data Engineer – Data Engineering & AI Platforms

We are looking for a highly skilled Senior Data Engineer (L2) who can design, bu...

Location

India, Chennai, Madurai, Coimbatore

Salary:

Not provided

OptiSol Business Solutions

Expiration Date

Until further notice

Requirements

Strong hands-on expertise in cloud ecosystems (Azure / AWS / GCP)
Excellent Python programming skills with data engineering libraries and frameworks
Advanced SQL capabilities including window functions, CTEs, and performance tuning
Solid understanding of distributed processing using Spark/PySpark
Experience designing and implementing scalable ETL/ELT workflows
Good understanding of data modeling concepts (dimensional, star, snowflake)
Familiarity with GenAI/LLM-based integration for data workflows
Experience working with Git, CI/CD, and Agile delivery frameworks
Strong communication skills for interacting with clients, stakeholders, and internal teams

Job Responsibility

Design, build, and maintain scalable ETL/ELT pipelines across cloud and big data platforms
Contribute to architectural discussions by translating business needs into data solutions spanning ingestion, transformation, and consumption layers
Work closely with solutioning and pre-sales teams for technical evaluations and client-facing discussions
Lead squads of L0/L1 engineers—ensuring delivery quality, mentoring, and guiding career growth
Develop cloud-native data engineering solutions using Python, SQL, PySpark, and modern data frameworks
Ensure data reliability, performance, and maintainability across the pipeline lifecycle—from development to deployment
Support long-term ODC/T&M projects by demonstrating expertise during technical discussions and interviews
Integrate emerging GenAI tools where applicable to enhance data enrichment, automation, and transformations

What we offer

Opportunity to work at the intersection of Data Engineering, Cloud, and Generative AI
Hands-on exposure to modern data stacks and emerging AI technologies
Collaboration with experts across Data, AI/ML, and cloud practices
Access to structured learning, certifications, and leadership mentoring
Competitive compensation with fast-track career growth and visibility

Fulltime

Senior Data Engineer

*** Senior Data Engineer - 10 Months+ Contract - Hybrid -Vilnius, Lithuania*** R...

Location

Lithuania, Vilnius

Salary:

Not provided

RED Commerce - The Global SAP Solutions Provider

Expiration Date

Until further notice

Requirements

5+ years of experience as a Data Engineer
7+ years of experience in the overall IT area
Strong experience with PySpark and Databricks (AWS environment)
Proven experience building data pipelines and implementing Medallion architecture from scratch
Excellent communication skills with fluent English (spoken and written)
Strong stakeholder management skills with the ability to gather business requirements and translate into technical solutions
Able to join within 2-4 weeks of offer

Senior Data Engineer

Senior Data Engineer. Must Haves: Strong hands-on experience building and suppor...

Location

United States, Los Angeles

Salary:

Not provided

Beacon Hill

Expiration Date

Until further notice

Requirements

Strong hands-on experience building and supporting production-grade data pipelines using PySpark and Apache Spark
Extensive experience working within Databricks environments
Experience with Databricks Unity Catalog migrations and modernization initiatives
Experience with Databricks Unity Catalog governance and security models
Strong understanding of ETL/ELT design, data modeling, and incremental data processing
Experience modernizing legacy data platforms and data migration projects
Experience working with large-scale datasets in cloud-based data platforms
Experience with workflow orchestration, scheduling, and pipeline automation
Knowledge of cloud data platforms such as Snowflake and/or Google Cloud
Experience implementing data quality checks, monitoring, logging, and alerting

Job Responsibility

Migrate existing data assets and pipelines to Databricks Unity Catalog
Refactor and optimize existing pipelines using modern engineering standards and best practices
Build and maintain scalable PySpark-based data pipelines
Develop new source-of-truth datasets and modernize legacy data workflows
Design and implement workflow orchestration using Databricks Jobs
Deploy and manage pipelines using Databricks Asset Bundles
Ensure data pipelines are reliable, observable, and production-ready
Implement data quality controls, monitoring, and operational alerting
Support platform migrations, testing, validation, and production cutovers
Partner with engineering teams to improve platform stability, scalability, and data reliability

Fulltime

Senior Data Engineer

We are seeking an experienced Senior Data Engineer to join a leading automotive ...

Location

Sweden, Gothenburg

Salary:

53333.00 - 66667.00 SEK / Month

Amaris Consulting

Expiration Date

Until further notice

Requirements

Strong hands-on experience with PySpark and Databricks in production environments
Advanced Python programming skills
Experience with Git/Gerrit and CI/CD pipelines (GitHub Actions, Jenkins, GitLab, or similar)
Solid knowledge of Docker and workflow orchestration tools such as Airflow or Databricks Jobs
Expertise in designing scalable, privacy-compliant data architectures and pipelines
Strong SQL skills and experience handling large-scale datasets
Experience with Azure cloud services and Snowflake (or equivalent cloud data warehouse technologies)
Knowledge of monitoring, observability, automated testing, and cost optimization practices
Familiarity with Power BI or similar BI and reporting tools
Understanding of automotive UDS diagnostics and vehicle signal data

Job Responsibility

Design and develop scalable data pipelines using PySpark and Databricks
Build and maintain raw and refined data products related to vehicle signals, software versions, drive cycles, distance measurements, and driver feedback
Optimize data processing performance, reliability, and cost efficiency
Implement monitoring, observability, automated testing, and usage tracking for data products
Collaborate with architects, engineers, Product Owners, and business stakeholders to deliver robust data solutions
Support documentation, knowledge transfer, and operational handovers
Contribute to technical leadership and best practices within the data engineering domain

What we offer

An international community, bringing together 110+ different nationalities
An environment where trust has a central place: 70% of our key leaders started their careers at the first level of responsibilities
A robust training system with our internal Academy and 250+ available modules
A vibrant workplace that frequently gathers for internal events (afterworks, team buildings, etc.)
Opportunity to turn ideas into action and make a tangible impact through ESG commitments
WeCare Together program empowering employees to design and lead projects that create social or environmental impact

Fulltime

Senior Data Engineer

Senior Data Engineer - Media Org (Hybrid NYC or LA). I’m working on a Senior Dat...

Location

United States, New York City or Los Angeles

Salary:

Not provided

Lawrence Harvey

Expiration Date

Until further notice

Requirements

Strong Data Engineering experience building production data pipelines
Strong PySpark and SQL expertise
Experience with Databricks or similar Spark-based cloud platforms
Experience building batch and streaming pipelines
Experience with orchestration tools such as Airflow or similar
Familiarity with AWS environments, data lakes, and modern cloud data architecture
Experience with modern Lakehouse design patterns, including Medallion Architecture (Bronze, Silver, and Gold layers) for structuring scalable and modular data pipelines

Job Responsibility

Designing batch and real-time data pipelines
Working heavily within a Databricks environment
Helping shape modern lakehouse architecture
Partnering closely with analytics, product, and data science teams

Fulltime
