CrawlJobs Logo

Data Engineer - AWS, PySpark

barclays.co.uk Logo

Barclays

Location Icon

Location:
India , Bengaluru

Category Icon

Job Type Icon

Contract Type:
Not provided

Salary Icon

Salary:

Not provided

Job Description:

You will be responsible for supporting the successful delivery of Location Strategy projects to plan, budget, agreed quality and governance standards. You'll spearhead the evolution of our digital landscape, driving innovation and excellence. You will harness cutting-edge technology to revolutionise our digital offerings, ensuring unparalleled customer experiences. To build and maintain the systems that collect, store, process, and analyse data, such as data pipelines, data warehouses and data lakes to ensure that all data is accurate, accessible, and secure.

Job Responsibility:

  • Build and maintenance of data architectures pipelines that enable the transfer and processing of durable, complete and consistent data
  • Design and implementation of data warehoused and data lakes that manage the appropriate data volumes and velocity and adhere to the required security measures
  • Development of processing and analysis algorithms fit for the intended data complexity and volumes
  • Collaboration with data scientist to build and deploy machine learning models

Requirements:

  • Hands on experience in pyspark and strong knowledge on Dataframes, RDD and SparkSQL
  • Hands on Experience in developing, testing and maintaining applications on AWS Cloud
  • Strong hold on AWS Data Analytics Technology Stack (Glue, S3, Lambda, Lake formation, Athena)
  • Design and implement scalable and efficient data transformation/storage solutions using Snowflake
  • Experience in Data ingestion to Snowflake for different storage format such Parquet, Iceberg, JSON, CSV etc
  • Experience in using DBT (Data Build Tool) with snowflake for ELT pipeline development
  • Experience in Writing advanced SQL and PL SQL programs
  • Hands On Experience for building reusable components using Snowflake and AWS Tools/Technology
  • Should have worked at least on two major project implementations
  • Exposure to data governance or lineage tools such as Immuta and Alation is added advantage
  • Experience in using Orchestration tools such as Apache Airflow or Snowflake Tasks is added advantage
  • Knowledge on Abinitio ETL tool is a plus

Nice to have:

  • Ability to engage with Stakeholders, elicit requirements/ user stories and translate requirements into ETL components
  • Ability to understand the infrastructure setup and be able to provide solutions either individually or working with teams
  • Good knowledge of Data Marts and Data Warehousing concepts
  • Resource should possess good analytical and Interpersonal skills
  • Implement Cloud based Enterprise data warehouse with multiple data platform along with Snowflake and NoSQL environment to build data movement strategy
What we offer:
  • Competitive holiday allowance
  • Life assurance
  • Private medical care
  • Pension contribution

Additional Information:

Job Posted:
January 06, 2026

Employment Type:
Fulltime
Work Type:
Hybrid work
Job Link Share:

Looking for more opportunities? Search for other job offers that match your skills and interests.

Briefcase Icon

Similar Jobs for Data Engineer - AWS, PySpark

Senior Data Engineer – Data Engineering & AI Platforms

We are looking for a highly skilled Senior Data Engineer (L2) who can design, bu...
Location
Location
India , Chennai, Madurai, Coimbatore
Salary
Salary:
Not provided
optisolbusiness.com Logo
OptiSol Business Solutions
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Strong hands-on expertise in cloud ecosystems (Azure / AWS / GCP)
  • Excellent Python programming skills with data engineering libraries and frameworks
  • Advanced SQL capabilities including window functions, CTEs, and performance tuning
  • Solid understanding of distributed processing using Spark/PySpark
  • Experience designing and implementing scalable ETL/ELT workflows
  • Good understanding of data modeling concepts (dimensional, star, snowflake)
  • Familiarity with GenAI/LLM-based integration for data workflows
  • Experience working with Git, CI/CD, and Agile delivery frameworks
  • Strong communication skills for interacting with clients, stakeholders, and internal teams
Job Responsibility
Job Responsibility
  • Design, build, and maintain scalable ETL/ELT pipelines across cloud and big data platforms
  • Contribute to architectural discussions by translating business needs into data solutions spanning ingestion, transformation, and consumption layers
  • Work closely with solutioning and pre-sales teams for technical evaluations and client-facing discussions
  • Lead squads of L0/L1 engineers—ensuring delivery quality, mentoring, and guiding career growth
  • Develop cloud-native data engineering solutions using Python, SQL, PySpark, and modern data frameworks
  • Ensure data reliability, performance, and maintainability across the pipeline lifecycle—from development to deployment
  • Support long-term ODC/T&M projects by demonstrating expertise during technical discussions and interviews
  • Integrate emerging GenAI tools where applicable to enhance data enrichment, automation, and transformations
What we offer
What we offer
  • Opportunity to work at the intersection of Data Engineering, Cloud, and Generative AI
  • Hands-on exposure to modern data stacks and emerging AI technologies
  • Collaboration with experts across Data, AI/ML, and cloud practices
  • Access to structured learning, certifications, and leadership mentoring
  • Competitive compensation with fast-track career growth and visibility
  • Fulltime
Read More
Arrow Right

Software Engineer (Data Engineering)

We are seeking a Software Engineer (Data Engineering) who can seamlessly integra...
Location
Location
India , Hyderabad
Salary
Salary:
Not provided
nstarxinc.com Logo
NStarX
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 4+ years in Data Engineering and AI/ML roles
  • Bachelor’s or Master’s degree in Computer Science, Data Science, or a related field
  • Python, SQL, Bash, PySpark, Spark SQL, boto3, pandas
  • Apache Spark on EMR (driver/executor model, sizing, dynamic allocation)
  • Amazon S3 (Parquet) with lifecycle management to Glacier
  • AWS Glue Catalog and Crawlers
  • AWS Step Functions, AWS Lambda, Amazon EventBridge
  • CloudWatch Logs and Metrics, Kinesis Data Firehose (or Kafka/MSK)
  • Amazon Redshift and Redshift Spectrum
  • IAM (least privilege), Secrets Manager, SSM
Job Responsibility
Job Responsibility
  • Design, build, and maintain scalable ETL and ELT pipelines for large-scale data processing
  • Develop and optimize data architectures supporting analytics and ML workflows
  • Ensure data integrity, security, and compliance with organizational and industry standards
  • Collaborate with DevOps teams to deploy and monitor data pipelines in production environments
  • Build predictive and prescriptive models leveraging AI and ML techniques
  • Develop and deploy machine learning and deep learning models using TensorFlow, PyTorch, or Scikit-learn
  • Perform feature engineering, statistical analysis, and data preprocessing
  • Continuously monitor and optimize models for accuracy and scalability
  • Integrate AI-driven insights into business processes and strategies
  • Serve as the technical liaison between NStarX and client teams
What we offer
What we offer
  • Competitive salary and performance-based incentives
  • Opportunity to work on cutting-edge AI and ML projects
  • Exposure to global clients and international project delivery
  • Continuous learning and professional development opportunities
  • Competitive base + commission
  • Fast growth into leadership roles
  • Fulltime
Read More
Arrow Right

Senior Data Engineer

At Rearc, we're committed to empowering engineers to build awesome products and ...
Location
Location
India , Bangalore
Salary
Salary:
Not provided
rearc.io Logo
Rearc
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 8+ years of experience in data engineering, showcasing expertise in diverse architectures, technology stacks, and use cases
  • Strong expertise in designing and implementing data warehouse and data lake architectures, particularly in AWS environments
  • Extensive experience with Python for data engineering tasks, including familiarity with libraries and frameworks commonly used in Python-based data engineering workflows
  • Proven experience with data pipeline orchestration using platforms such as Airflow, Databricks, DBT or AWS Glue
  • Hands-on experience with data analysis tools and libraries like Pyspark, NumPy, Pandas, or Dask
  • Proficiency with Spark and Databricks is highly desirable
  • Experience with SQL and NoSQL databases, including PostgreSQL, Amazon Redshift, Delta Lake, Iceberg and DynamoDB
  • In-depth knowledge of data architecture principles and best practices, especially in cloud environments
  • Proven experience with AWS services, including expertise in using AWS CLI, SDK, and Infrastructure as Code (IaC) tools such as Terraform, CloudFormation, or AWS CDK
  • Exceptional communication skills, capable of clearly articulating complex technical concepts to both technical and non-technical stakeholders
Job Responsibility
Job Responsibility
  • Strategic Data Engineering Leadership: Provide strategic vision and technical leadership in data engineering, guiding the development and execution of advanced data strategies that align with business objectives
  • Architect Data Solutions: Design and architect complex data pipelines and scalable architectures, leveraging advanced tools and frameworks (e.g., Apache Kafka, Kubernetes) to ensure optimal performance and reliability
  • Drive Innovation: Lead the exploration and adoption of new technologies and methodologies in data engineering, driving innovation and continuous improvement across data processes
  • Technical Expertise: Apply deep expertise in ETL processes, data modelling, and data warehousing to optimize data workflows and ensure data integrity and quality
  • Collaboration and Mentorship: Collaborate closely with cross-functional teams to understand requirements and deliver impactful data solutions—mentor and coach junior team members, fostering their growth and development in data engineering practices
  • Thought Leadership: Contribute to thought leadership in the data engineering domain through technical articles, conference presentations, and participation in industry forums
Read More
Arrow Right

Senior Data Engineer

Senior Data Engineer position at Checkr, building the data platform to power saf...
Location
Location
United States , San Francisco
Salary
Salary:
162000.00 - 190000.00 USD / Year
https://checkr.com Logo
Checkr
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 7+ years of development experience in the field of data engineering
  • 5+ years writing PySpark
  • Experience building large-scale (100s of Terabytes and Petabytes) data processing pipelines - batch and stream
  • Experience with ETL/ELT, stream and batch processing of data at scale
  • Strong proficiency in PySpark and Python
  • Expertise in understanding of database systems, data modeling, relational databases, NoSQL (such as MongoDB)
  • Experience with big data technologies such as Kafka, Spark, Iceberg, Datalake and AWS stack (EKS, EMR, Serverless, Glue, Athena, S3, etc.)
  • Knowledge of security best practices and data privacy concerns
  • Strong problem-solving skills and attention to detail
Job Responsibility
Job Responsibility
  • Create and maintain data pipelines and foundational datasets to support product/business needs
  • Design and build database architectures with massive and complex data, balancing with computational load and cost
  • Develop audits for data quality at scale, implementing alerting as necessary
  • Create scalable dashboards and reports to support business objectives and enable data-driven decision-making
  • Troubleshoot and resolve complex issues in production environments
  • Work closely with product managers and other stakeholders to define and implement new features
What we offer
What we offer
  • Learning and development reimbursement allowance
  • Competitive compensation and opportunity for professional and personal advancement
  • 100% medical, dental, and vision coverage for employees and dependents
  • Additional vacation benefits of 5 extra days and flexibility to take time off
  • Reimbursement for work from home equipment
  • Lunch four times a week
  • Commuter stipend
  • Abundance of snacks and beverages
  • Fulltime
Read More
Arrow Right

Principal Data Engineer

We are seeking an experienced Principal Data Engineer to define, lead, and scale...
Location
Location
United Kingdom , Thame; Leeds
Salary
Salary:
80000.00 - 100000.00 GBP / Year
pexa.co.uk Logo
PEXA UK
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Broad experience as a Data Engineer including technical leadership
  • Broad cloud experience, ideally both Azure and AWS
  • Deep expertise in PySpark and distributed data processing at scale
  • Extensive experience designing and optimising in Databricks
  • Advanced SQL optimisation and schema design for analytical workloads
  • Strong understanding of data security, privacy, and GDPR/PII compliance
  • Experience implementing and leading data governance frameworks
  • Proven experience leading the design and operation of a complex data platform
  • Track record of mentoring engineers and raising technical standards
  • Ability to influence senior stakeholders and align data initiatives with wider business goals
Job Responsibility
Job Responsibility
  • Design and oversee scalable, performant, and secure architectures on Databricks and distributed systems
  • Anticipate scaling challenges and ensure platforms are future-proof
  • Lead the design and development of robust, high-performance data pipelines using PySpark and Databricks
  • Define and ensure testing frameworks for data workflows
  • Ensure end-to-end data quality from raw ingestion to curated, trusted datasets powering analytics
  • Establish and enforce best practices for data governance, lineage, metadata, and security controls
  • Ensure compliance with GDPR and other regulatory frameworks
  • Act as a technical authority and mentor, guiding data engineers
  • Influence cross-functional teams to align on data strategy, standards, and practices
  • Partner with product, engineering, and business leaders to prioritise and deliver high-impact data initiatives
  • Fulltime
Read More
Arrow Right

Senior Data Engineering Architect

Location
Location
Poland
Salary
Salary:
Not provided
lingarogroup.com Logo
Lingaro
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • Proven work experience as a Data Engineering Architect or a similar role and strong experience in in the Data & Analytics area
  • Strong understanding of data engineering concepts, including data modeling, ETL processes, data pipelines, and data governance
  • Expertise in designing and implementing scalable and efficient data processing frameworks
  • In-depth knowledge of various data technologies and tools, such as relational databases, NoSQL databases, data lakes, data warehouses, and big data frameworks (e.g., Hadoop, Spark)
  • Experience in selecting and integrating appropriate technologies to meet business requirements and long-term data strategy
  • Ability to work closely with stakeholders to understand business needs and translate them into data engineering solutions
  • Strong analytical and problem-solving skills, with the ability to identify and address complex data engineering challenges
  • Proficiency in Python, PySpark, SQL
  • Familiarity with cloud platforms and services, such as AWS, GCP, or Azure, and experience in designing and implementing data solutions in a cloud environment
  • Knowledge of data governance principles and best practices, including data privacy and security regulations
Job Responsibility
Job Responsibility
  • Collaborate with stakeholders to understand business requirements and translate them into data engineering solutions
  • Design and oversee the overall data architecture and infrastructure, ensuring scalability, performance, security, maintainability, and adherence to industry best practices
  • Define data models and data schemas to meet business needs, considering factors such as data volume, velocity, variety, and veracity
  • Select and integrate appropriate data technologies and tools, such as databases, data lakes, data warehouses, and big data frameworks, to support data processing and analysis
  • Create scalable and efficient data processing frameworks, including ETL (Extract, Transform, Load) processes, data pipelines, and data integration solutions
  • Ensure that data engineering solutions align with the organization's long-term data strategy and goals
  • Evaluate and recommend data governance strategies and practices, including data privacy, security, and compliance measures
  • Collaborate with data scientists, analysts, and other stakeholders to define data requirements and enable effective data analysis and reporting
  • Provide technical guidance and expertise to data engineering teams, promoting best practices and ensuring high-quality deliverables. Support to team throughout the implementation process, answering questions and addressing issues as they arise
  • Oversee the implementation of the solution, ensuring that it is implemented according to the design documents and technical specifications
What we offer
What we offer
  • Stable employment. On the market since 2008, 1500+ talents currently on board in 7 global sites
  • Workation. Enjoy working from inspiring locations in line with our workation policy
  • Great Place to Work® certified employer
  • Flexibility regarding working hours and your preferred form of contract
  • Comprehensive online onboarding program with a “Buddy” from day 1
  • Cooperation with top-tier engineers and experts
  • Unlimited access to the Udemy learning platform from day 1
  • Certificate training programs. Lingarians earn 500+ technology certificates yearly
  • Upskilling support. Capability development programs, Competency Centers, knowledge sharing sessions, community webinars, 110+ training opportunities yearly
  • Grow as we grow as a company. 76% of our managers are internal promotions
Read More
Arrow Right

Staff Data Engineer

Checkr is hiring an experienced Staff Data Engineer to join their Data Platform ...
Location
Location
United States , San Francisco; Denver
Salary
Salary:
166000.00 - 230000.00 USD / Year
https://checkr.com Logo
Checkr
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 10+ years of designing, implementing and delivering highly scalable and performant data platform
  • experience building large-scale (100s of Terabytes and Petabytes) data processing pipelines - batch and stream
  • experience with ETL/ELT, stream and batch processing of data at scale
  • expert level proficiency in PySpark, Python, and SQL
  • expertise in data modeling, relational databases, NoSQL (such as MongoDB) data stores
  • experience with big data technologies such as Kafka, Spark, Iceberg, Datalake, and AWS stack (EKS, EMR, Serverless, Glue, Athena, S3, etc.)
  • an understanding of Graph and Vector data stores (preferred)
  • knowledge of security best practices and data privacy concerns
  • strong problem-solving skills and attention to detail
  • experience/knowledge of data processing platforms such as Databricks or Snowflake.
Job Responsibility
Job Responsibility
  • Architect, design, lead and build end-to-end performant, reliable, scalable data platform
  • monitor, investigate, triage, and resolve production issues as they arise for services owned by the team
  • mentor, guide and work with junior engineers to deliver complex and next-generation features
  • partner with engineering, product, design, and other stakeholders in designing and architecting new features
  • create and maintain data pipelines and foundational datasets to support product/business needs
  • experiment with rapid MVPs and encourage validation of customer needs
  • design and build database architectures with massive and complex data
  • develop audits for data quality at scale
  • create scalable dashboards and reports to support business objectives and enable data-driven decision-making
  • troubleshoot and resolve complex issues in production environments.
What we offer
What we offer
  • A fast-paced and collaborative environment
  • learning and development allowance
  • competitive cash and equity compensation and opportunity for advancement
  • 100% medical, dental, and vision coverage
  • up to $25K reimbursement for fertility, adoption, and parental planning services
  • flexible PTO policy
  • monthly wellness stipend
  • home office stipend
  • in-office perks such as lunch four times a week, a commuter stipend, and an abundance of snacks and beverages.
  • Fulltime
Read More
Arrow Right

Senior Big Data Engineer

The Big Data Engineer is a senior level position responsible for establishing an...
Location
Location
Canada , Mississauga
Salary
Salary:
94300.00 - 141500.00 USD / Year
https://www.citi.com/ Logo
Citi
Expiration Date
Until further notice
Flip Icon
Requirements
Requirements
  • 5+ Years of Experience in Big Data Engineering (PySpark)
  • Data Pipeline Development: Design, build, and maintain scalable ETL/ELT pipelines to ingest, transform, and load data from multiple sources
  • Big Data Infrastructure: Develop and manage large-scale data processing systems using frameworks like Apache Spark, Hadoop, and Kafka
  • Proficiency in programming languages like Python, or Scala
  • Strong expertise in data processing frameworks such as Apache Spark, Hadoop
  • Expertise in Data Lakehouse technologies (Apache Iceberg, Apache Hudi, Trino)
  • Experience with cloud data platforms like AWS (Glue, EMR, Redshift), Azure (Synapse), or GCP (BigQuery)
  • Expertise in SQL and database technologies (e.g., Oracle, PostgreSQL, etc.)
  • Experience with data orchestration tools like Apache Airflow or Prefect
  • Familiarity with containerization (Docker, Kubernetes) is a plus
Job Responsibility
Job Responsibility
  • Partner with multiple management teams to ensure appropriate integration of functions to meet goals as well as identify and define necessary system enhancements to deploy new products and process improvements
  • Resolve variety of high impact problems/projects through in-depth evaluation of complex business processes, system processes, and industry standards
  • Provide expertise in area and advanced knowledge of applications programming and ensure application design adheres to the overall architecture blueprint
  • Utilize advanced knowledge of system flow and develop standards for coding, testing, debugging, and implementation
  • Develop comprehensive knowledge of how areas of business, such as architecture and infrastructure, integrate to accomplish business goals
  • Provide in-depth analysis with interpretive thinking to define issues and develop innovative solutions
  • Serve as advisor or coach to mid-level developers and analysts, allocating work as necessary
  • Appropriately assess risk when business decisions are made, demonstrating consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency
What we offer
What we offer
  • Well-being support
  • Growth opportunities
  • Work-life balance support
  • Fulltime
Read More
Arrow Right