Lead PySpark Engineer

Company:
Randstad

Location:
United Kingdom, City of London

Contract Type:
Not provided

Salary:
Not provided

Job Description:

As the Technical Lead, you will drive the high-stakes migration of legacy SAS analytics to a modern, cloud-native PySpark ecosystem on AWS. This isn't just a lift-and-shift: you will refactor complex procedural logic into scalable, production-ready distributed pipelines for a Tier-1 financial services environment.
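
Purely as an illustration of the kind of refactoring this involves (the posting gives no actual SAS programs, datasets, or column names, so everything below is hypothetical), a procedural SAS summarisation could be re-expressed as a distributed PySpark aggregation roughly like this:

    # Hypothetical example only: table paths and column names are invented.
    # A SAS step along the lines of
    #   proc summary data=txns nway;
    #     class account_id; var amount;
    #     output out=bal sum=balance;
    #   run;
    # becomes a distributed aggregation in PySpark:
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("sas_refactor_sketch").getOrCreate()

    txns = spark.read.parquet("s3://<bucket>/transactions/")   # placeholder path

    balances = txns.groupBy("account_id").agg(F.sum("amount").alias("balance"))

    balances.write.mode("overwrite").parquet("s3://<bucket>/balances/")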

Job Responsibility:

  • Engineering Leadership: Design and develop complex ETL/ELT pipelines and Data Marts using PySpark, EMR, and Glue
  • Legacy Modernisation: Architect the conversion of SAS Base/Macros into modular, testable Python code using SAS2PY and manual refactoring
  • Performance Tuning: Optimise Spark execution (partitioning, shuffling, caching) to ensure cost-efficient processing of massive financial datasets (see the sketch after this list)
  • Quality & Governance: Implement rigorous CI/CD, unit testing, and data reconciliation frameworks to ensure "penny-perfect" accuracy

Requirements:

  • Expert in PySpark
  • Expert in Python with Clean Code/SOLID principles
  • Proficiency in reading/debugging SAS (Base, Macros, DI Studio)
  • Experience with AWS: EMR, Glue, S3, Athena, IAM, Lambda
  • Experience with Data Modeling: SCD Type 2, Fact/Dimension tables, Data Vault/Star Schema (an SCD Type 2 sketch follows this list)
  • Experience with DevOps: Git-based workflows, Jenkins/GitLab CI, Terraform
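
For the SCD Type 2 item above, a hedged sketch in plain PySpark could look like the one below. It assumes a dimension with hypothetical columns customer_id, address, effective_from, effective_to, and is_current, tracks a single attribute, and leaves out brand-new keys and late-arriving data for brevity.

    # SCD Type 2 sketch; schema, paths and the single tracked attribute are assumptions.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("scd2_sketch").getOrCreate()

    current = spark.read.parquet("s3://<bucket>/dim_customer/")       # existing dimension
    incoming = spark.read.parquet("s3://<bucket>/staging_customer/")  # today's snapshot

    # Keys whose tracked attribute changed relative to the open dimension row.
    changed_keys = (
        incoming.alias("n")
        .join(current.filter("is_current").alias("c"), "customer_id")
        .filter(F.col("n.address") != F.col("c.address"))
        .select("customer_id")
    )

    # Close the superseded open rows.
    superseded = current.filter("is_current").join(changed_keys, "customer_id", "left_semi")
    closed = (
        superseded
        .withColumn("is_current", F.lit(False))
        .withColumn("effective_to", F.current_date())
    )

    # Open a new current row for each changed key.
    new_rows = (
        incoming.join(changed_keys, "customer_id", "left_semi")
        .withColumn("effective_from", F.current_date())
        .withColumn("effective_to", F.lit(None).cast("date"))
        .withColumn("is_current", F.lit(True))
    )

    dim_out = current.exceptAll(superseded).unionByName(closed).unionByName(new_rows)
    dim_out.write.mode("overwrite").parquet("s3://<bucket>/dim_customer_out/")

In a reconciliation-heavy migration like this one, the rewritten dimension would normally be compared row-for-row against the SAS output before the legacy job is retired.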

Additional Information:

Job Posted:
April 01, 2026

Expiration:
April 08, 2026

Similar Jobs for Lead PySpark Engineer

Pyspark Module Lead

We are seeking a highly skilled and motivated Data Engineer to join our dynamic ...
Location:
India, Noida
Salary:
Not provided
Company:
Sopra Steria
Expiration Date:
Until further notice
Requirements:
  • Proficiency in Advanced SQL (Window functions), Spark Architecture, Pyspark or Scala with Spark, Hadoop
  • Proven expertise in designing and deploying data pipelines
  • Strong problem-solving skills and ability to work effectively in a collaborative team environment
  • Excellent communication skills and ability to translate technical concepts to non-technical stakeholders
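
As a rough illustration of the window-function requirement above, the following hypothetical PySpark snippet keeps the latest event per user; the path and column names are invented.

    # Window-function sketch: latest record per key.
    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.window import Window

    spark = SparkSession.builder.appName("window_sketch").getOrCreate()

    events = spark.read.parquet("s3://<bucket>/events/")  # placeholder path

    w = Window.partitionBy("user_id").orderBy(F.col("event_ts").desc())

    latest = (
        events
        .withColumn("rn", F.row_number().over(w))
        .filter("rn = 1")
        .drop("rn")
    )
    latest.show()
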
Job Responsibility:
  • Work in tandem with Data Scientists to design, develop, and implement machine learning pipelines
  • Utilize PySpark for data processing, transformation, and preparation for model training
  • Leverage AWS EMR and S3 for scalable and efficient data storage and processing
  • Implement and manage ETL workflows using Streamsets for data ingestion and transformation
  • Design and construct pipelines to deliver high-quality training and inference datasets
  • Collaborate with cross-functional teams to ensure smooth deployment and real-time/near real-time inferencing capabilities
  • Optimize and fine-tune pipelines for performance, scalability, and reliability
  • Ensure IAM policies and permissions are appropriately configured for secure data access and management
  • Implement Spark architecture and optimize Spark jobs for scalable data processing
What we offer:
  • All positions are open to people with disabilities
  • Commitment to fighting against all forms of discrimination
  • Inclusive and respectful work environment

Lead Data Engineer

We are seeking an experienced Senior Data Engineer to lead the development of a ...
Location:
India, Kochi; Trivandrum
Salary:
Not provided
Company:
Experion Technologies
Expiration Date:
Until further notice
Requirements:
  • 5+ years experience in data engineering with analytical platform development focus
  • Proficiency in Python and/or PySpark
  • Strong SQL skills for ETL processes and large-scale data manipulation
  • Extensive AWS experience (Glue, Lambda, Step Functions, S3)
  • Familiarity with big data systems (AWS EMR, Apache Spark, Apache Iceberg)
  • Database experience with DynamoDB, Aurora, Postgres, or Redshift
  • Proven experience designing and implementing RESTful APIs
  • Hands-on CI/CD pipeline experience (preferably GitLab)
  • Agile development methodology experience
  • Strong problem-solving abilities and attention to detail
Job Responsibility:
  • Architect, develop, and maintain end-to-end data ingestion framework for extracting, transforming, and loading data from diverse sources
  • Use AWS services (Glue, Lambda, EMR, ECS, EC2, Step Functions) to build scalable, resilient automated data pipelines
  • Develop and implement automated data quality checks, validation routines, and error-handling mechanisms
  • Establish comprehensive monitoring, logging, and alerting systems for data quality issues
  • Architect and develop secure, high-performance APIs for data services integration
  • Create thorough API documentation and establish standards for security, versioning, and performance
  • Work with business stakeholders, data scientists, and operations teams to understand requirements
  • Participate in sprint planning, code reviews, and agile ceremonies
  • Contribute to CI/CD pipeline development using GitLab
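
The automated data-quality responsibility above could, in its simplest form, be handled by rule-based checks like the sketch below; the rules, column names, and path are assumptions rather than the team's actual framework.

    # Minimal rule-based data-quality gate; everything named here is hypothetical.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("dq_sketch").getOrCreate()

    df = spark.read.parquet("s3://<bucket>/ingested/orders/")

    checks = {
        "order_id_not_null": df.filter(F.col("order_id").isNull()).count() == 0,
        "amount_non_negative": df.filter(F.col("amount") < 0).count() == 0,
        "no_duplicate_keys": df.groupBy("order_id").count().filter(F.col("count") > 1).count() == 0,
    }

    failed = [name for name, passed in checks.items() if not passed]
    if failed:
        raise ValueError(f"Data quality checks failed: {failed}")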

Data Engineering Lead

Embark on an exciting journey into the realm of software product development wit...
Location:
India
Salary:
Not provided
Company:
3Pillar Global
Expiration Date:
Until further notice
Requirements:
  • 8+ years of experience in Data Engineering or related field, including 2+ years in a lead role
  • Expert-level proficiency with AWS data services (e.g., Glue, EMR, Lambda, Redshift, S3, Kinesis, Step Functions)
  • Strong Python skills for data processing, automation, and pipeline development
  • Experience building batch and streaming pipelines (Spark, PySpark, Kafka, Kinesis, etc.)
  • Strong SQL expertise and experience with relational and NoSQL databases
  • Hands-on experience with IaC (Terraform, CloudFormation, CDK)
  • Familiarity with DevOps tools for CI/CD (e.g., GitHub Actions, GitLab CI, Jenkins)
  • Understanding of data modeling, data warehousing concepts, and distributed systems
  • Fulltime
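
For the batch-and-streaming requirement above, a minimal Structured Streaming skeleton reading from Kafka might look like this; the broker address, topic, schema, and output paths are placeholders, and the spark-sql-kafka connector is assumed to be on the classpath.

    # Structured Streaming sketch; all endpoints and the schema are invented.
    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.types import StructType, StructField, StringType, DoubleType

    spark = SparkSession.builder.appName("streaming_sketch").getOrCreate()

    schema = StructType([
        StructField("event_id", StringType()),
        StructField("amount", DoubleType()),
    ])

    raw = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "<broker>:9092")
        .option("subscribe", "<topic>")
        .load()
    )

    parsed = (
        raw.select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
        .select("e.*")
    )

    query = (
        parsed.writeStream
        .format("parquet")
        .option("path", "s3://<bucket>/stream_out/")
        .option("checkpointLocation", "s3://<bucket>/checkpoints/stream_out/")
        .start()
    )
    query.awaitTermination()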

Lead Data Engineer

As a Lead Data Engineer at Rearc, you'll play a pivotal role in establishing and...
Location:
India, Bengaluru
Salary:
Not provided
Company:
Rearc
Expiration Date:
Until further notice
Requirements:
  • 10+ years of experience in data engineering, data architecture, or related fields
  • Extensive experience in writing and testing Java and/or Python
  • Proven experience with data pipeline orchestration using platforms such as Airflow, Databricks, DBT or AWS Glue
  • Hands-on experience with data analysis tools and libraries like Pyspark, NumPy, Pandas, or Dask
  • Proficiency with Spark and Databricks is highly desirable
  • Proven track record of leading complex data engineering projects, including designing and implementing scalable data solutions
  • Hands-on experience with ETL processes, data warehousing, and data modeling tools
  • In-depth knowledge of data integration tools and best practices
  • Strong understanding of cloud-based data services and technologies (e.g., AWS Redshift, Azure Synapse Analytics, Google BigQuery)
  • Strong strategic and analytical skills
Job Responsibility:
  • Understand Requirements and Challenges: Collaborate with stakeholders to deeply understand their data requirements and challenges
  • Implement with a DataOps Mindset: Embrace a DataOps mindset and utilize modern data engineering tools and frameworks, such as Apache Airflow, Apache Spark, or similar, to build scalable and efficient data pipelines and architectures
  • Lead Data Engineering Projects: Take the lead in managing and executing data engineering projects, providing technical guidance and oversight to ensure successful project delivery
  • Mentor Data Engineers: Share your extensive knowledge and experience in data engineering with junior team members, guiding and mentoring them to foster their growth and development in the field
  • Promote Knowledge Sharing: Contribute to our knowledge base by writing technical blogs and articles, promoting best practices in data engineering, and contributing to a culture of continuous learning and innovation
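
The Airflow orchestration mentioned above could be demonstrated with a DAG as small as the sketch below; the DAG id, schedule, and spark-submit script path are invented, and an Airflow 2.x environment is assumed.

    # Hypothetical two-task Airflow DAG wrapping a Spark job.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="example_daily_spark_pipeline",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        extract = BashOperator(task_id="extract", bash_command="echo 'extract placeholder'")
        transform = BashOperator(
            task_id="transform",
            bash_command="spark-submit /opt/jobs/transform.py",  # placeholder script path
        )
        extract >> transform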

Pyspark Module Lead

Sopra Steria is seeking a PySpark Module Lead based in Bengaluru, India. The pos...
Location:
India, Bengaluru
Salary:
Not provided
Company:
Sopra Steria
Expiration Date:
Until further notice
Requirements:
  • Python
  • PySpark
  • SQL
  • JavaScript
  • Proficient with Skywise platform and tools e.g. Contour, Code-Workbook, Code, Slate and Ontology
Job Responsibility:
  • Designing and building solutions for Airbus on foundry
  • Data engineering and integration
  • Data analysis and visualization
  • Application development within Skywise Framework
  • Working closely with Product Owners and core Airbus business functions to understand processes, capture requirements, and solve business problems
  • Full delivery lifecycle for designing, implementing, testing, documenting, and supporting applications
  • CI/CD pipeline development and automated deployments
  • Troubleshooting application issues
  • Providing support on developed tools
What we offer:
  • Inclusive and respectful work environment
  • Open to people with disabilities
  • Fulltime

Big Data Engineering Lead

The Senior Big Data engineering lead will play a pivotal role in designing, impl...
Location:
India, Chennai
Salary:
Not provided
Company:
Citi
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or Master’s degree in Computer Science, Information Technology, or related field
  • At least 10-12 years of overall software development experience, primarily building applications that handle large-scale data volumes across ingestion, persistence, and retrieval
  • Deep understanding of big data technologies, including Hadoop, Spark, Kafka, Flink, NoSQL databases, etc.
  • Hands-on developer experience with big data technologies: Hadoop, Apache Spark, Python, PySpark
  • Strong programming skills in languages such as Java, Scala, or Python
  • Excellent problem-solving skills with a knack for innovative solutions
  • Strong communication and leadership abilities
  • Proven ability to manage multiple projects simultaneously and deliver results
Job Responsibility:
  • Lead the design and development of a robust and scalable big data architecture handling exponential data growth while maintaining high availability and resilience
  • Design complex data transformation processes with Spark and other big data technologies, using Java, PySpark, or Scala
  • Design and implement data pipelines that ensure data quality, integrity, and availability
  • Collaborate with cross-functional teams to understand business needs and translate them into technical requirements
  • Evaluate and select technologies that improve data efficiency, scalability, and performance
  • Oversee the deployment and management of big data tools and frameworks such as Hadoop, Spark, Kafka, and others
  • Provide technical guidance and mentorship to the development team and junior architects
  • Continuously assess and integrate emerging technologies and methodologies to enhance data processing capabilities
  • Optimize big data frameworks, such as Hadoop, Spark, for performance improvements and reduced processing time across distributed systems
  • Implement data governance frameworks to ensure data accuracy, consistency, and privacy across the organization, leveraging metadata management and data lineage tracking
  • Fulltime

Data Engineering Architect

Data engineering involves the development of solutions for the collection, trans...
Location:
India
Salary:
Not provided
Company:
Lingaro
Expiration Date:
Until further notice
Requirements:
  • 10+ years’ experience in the Data & Analytics area
  • 4+ years’ experience in Data Engineering Architecture
  • Proficiency in Python, PySpark, SQL
  • Strong expertise in Azure cloud services such as ADF, Databricks, PySpark, and Logic Apps
  • Strong understanding of data engineering concepts, including data modeling, ETL processes, data pipelines, and data governance
  • Expertise in designing and implementing scalable and efficient data processing frameworks
  • In-depth knowledge of various data technologies and tools, such as relational databases, NoSQL databases, data lakes, data warehouses, and big data frameworks (e.g., Hadoop, Spark)
  • Experience in selecting and integrating appropriate technologies to meet business requirements and long-term data strategy
  • Ability to work closely with stakeholders to understand business needs and translate them into data engineering solutions
  • Strong analytical and problem-solving skills, with the ability to identify and address complex data engineering challenges
Job Responsibility:
  • Collaborate with stakeholders to understand business requirements and translate them into data engineering solutions
  • Design and oversee the overall data architecture and infrastructure, ensuring scalability, performance, security, maintainability, and adherence to industry best practices
  • Define data models and data schemas to meet business needs, considering factors such as data volume, velocity, variety, and veracity
  • Select and integrate appropriate data technologies and tools, such as databases, data lakes, data warehouses, and big data frameworks, to support data processing and analysis
  • Create scalable and efficient data processing frameworks, including ETL (Extract, Transform, Load) processes, data pipelines, and data integration solutions
  • Ensure that data engineering solutions align with the organization's long-term data strategy and goals
  • Evaluate and recommend data governance strategies and practices, including data privacy, security, and compliance measures
  • Collaborate with data scientists, analysts, and other stakeholders to define data requirements and enable effective data analysis and reporting
  • Provide technical guidance and expertise to data engineering teams, promoting best practices and ensuring high-quality deliverables
  • Support the team throughout the implementation process, answering questions and addressing issues as they arise
What we offer:
  • Stable employment
  • “Office as an option” model
  • Flexibility regarding working hours and your preferred form of contract
  • Comprehensive online onboarding program with a “Buddy” from day 1
  • Cooperation with top-tier engineers and experts
  • Unlimited access to the Udemy learning platform from day 1
  • Certificate training programs
  • Upskilling support
  • Internal Gallup Certified Strengths Coach to support your growth
  • Grow as we grow as a company

Senior Data Engineering Architect

Location:
Poland
Salary:
Not provided
Company:
Lingaro
Expiration Date:
Until further notice
Requirements:
  • Proven work experience as a Data Engineering Architect or a similar role and strong experience in the Data & Analytics area
  • Strong understanding of data engineering concepts, including data modeling, ETL processes, data pipelines, and data governance
  • Expertise in designing and implementing scalable and efficient data processing frameworks
  • In-depth knowledge of various data technologies and tools, such as relational databases, NoSQL databases, data lakes, data warehouses, and big data frameworks (e.g., Hadoop, Spark)
  • Experience in selecting and integrating appropriate technologies to meet business requirements and long-term data strategy
  • Ability to work closely with stakeholders to understand business needs and translate them into data engineering solutions
  • Strong analytical and problem-solving skills, with the ability to identify and address complex data engineering challenges
  • Proficiency in Python, PySpark, SQL
  • Familiarity with cloud platforms and services, such as AWS, GCP, or Azure, and experience in designing and implementing data solutions in a cloud environment
  • Knowledge of data governance principles and best practices, including data privacy and security regulations
Job Responsibility:
  • Collaborate with stakeholders to understand business requirements and translate them into data engineering solutions
  • Design and oversee the overall data architecture and infrastructure, ensuring scalability, performance, security, maintainability, and adherence to industry best practices
  • Define data models and data schemas to meet business needs, considering factors such as data volume, velocity, variety, and veracity
  • Select and integrate appropriate data technologies and tools, such as databases, data lakes, data warehouses, and big data frameworks, to support data processing and analysis
  • Create scalable and efficient data processing frameworks, including ETL (Extract, Transform, Load) processes, data pipelines, and data integration solutions
  • Ensure that data engineering solutions align with the organization's long-term data strategy and goals
  • Evaluate and recommend data governance strategies and practices, including data privacy, security, and compliance measures
  • Collaborate with data scientists, analysts, and other stakeholders to define data requirements and enable effective data analysis and reporting
  • Provide technical guidance and expertise to data engineering teams, promoting best practices and ensuring high-quality deliverables; support the team throughout the implementation process, answering questions and addressing issues as they arise
  • Oversee the implementation of the solution, ensuring that it is implemented according to the design documents and technical specifications
What we offer:
  • Stable employment. On the market since 2008, 1500+ talents currently on board in 7 global sites
  • Workation. Enjoy working from inspiring locations in line with our workation policy
  • Great Place to Work® certified employer
  • Flexibility regarding working hours and your preferred form of contract
  • Comprehensive online onboarding program with a “Buddy” from day 1
  • Cooperation with top-tier engineers and experts
  • Unlimited access to the Udemy learning platform from day 1
  • Certificate training programs. Lingarians earn 500+ technology certificates yearly
  • Upskilling support. Capability development programs, Competency Centers, knowledge sharing sessions, community webinars, 110+ training opportunities yearly
  • Grow as we grow as a company. 76% of our managers are internal promotions