Pyspark Technical Lead

Sopra Steria

Location:
India, Chennai

Category:
IT - Software Development

Contract Type:
Not provided

Salary:

Not provided

Job Description:

We are seeking a highly skilled and motivated Data Engineer to join our dynamic team. As a Data Engineer, you will collaborate closely with our Data Scientists to develop and deploy machine learning models. Proficiency in the listed skills will be crucial in building and maintaining pipelines for training and inference datasets.

Job Responsibility:

  • Work in tandem with Data Scientists to design, develop, and implement machine learning pipelines
  • Utilize PySpark for data processing, transformation, and preparation for model training
  • Leverage AWS EMR and S3 for scalable and efficient data storage and processing
  • Implement and manage ETL workflows using Streamsets for data ingestion and transformation
  • Design and construct pipelines to deliver high-quality training and inference datasets
  • Collaborate with cross-functional teams to ensure smooth deployment and real-time/near real-time inferencing capabilities
  • Optimize and fine-tune pipelines for performance, scalability, and reliability
  • Ensure IAM policies and permissions are appropriately configured for secure data access and management
  • Implement Spark architecture and optimize Spark jobs for scalable data processing

Requirements:

  • Proficiency in advanced SQL (window functions), Spark architecture, PySpark or Scala with Spark, and Hadoop
  • Proven expertise in designing and deploying data pipelines
  • Strong problem-solving skills and ability to work effectively in a collaborative team environment
  • Excellent communication skills and ability to translate technical concepts to non-technical stakeholders

Nice to have:

  • Hands-on experience with Airflow, S3, and Streamsets or similar ETL tools
  • Understanding of real-time or near real-time inferencing architectures
  • Basic knowledge of Kafka, AWS IAM, AWS EMR, and Snowflake

What we offer:

  • Inclusive and respectful work environment
  • Open positions for people with disabilities

Additional Information:

Job Posted:
April 26, 2025

Employment Type:
Fulltime
Work Type:
On-site work

Similar Jobs for Pyspark Technical Lead

Pyspark Module Lead

We are seeking a highly skilled and motivated Data Engineer to join our dynamic ...
Location:
India, Noida
Salary:
Not provided
Sopra Steria
Expiration Date:
Until further notice
Requirements:
  • Proficiency in advanced SQL (window functions), Spark architecture, PySpark or Scala with Spark, and Hadoop
  • Proven expertise in designing and deploying data pipelines
  • Strong problem-solving skills and ability to work effectively in a collaborative team environment
  • Excellent communication skills and ability to translate technical concepts to non-technical stakeholders
Job Responsibility:
  • Work in tandem with Data Scientists to design, develop, and implement machine learning pipelines
  • Utilize PySpark for data processing, transformation, and preparation for model training
  • Leverage AWS EMR and S3 for scalable and efficient data storage and processing
  • Implement and manage ETL workflows using Streamsets for data ingestion and transformation
  • Design and construct pipelines to deliver high-quality training and inference datasets
  • Collaborate with cross-functional teams to ensure smooth deployment and real-time/near real-time inferencing capabilities
  • Optimize and fine-tune pipelines for performance, scalability, and reliability
  • Ensure IAM policies and permissions are appropriately configured for secure data access and management
  • Implement Spark architecture and optimize Spark jobs for scalable data processing
What we offer:
  • All positions are open to people with disabilities
  • Commitment to fighting against all forms of discrimination
  • Inclusive and respectful work environment

Big Data Lead Developer

We are seeking a highly skilled and experienced Big Data Lead Developer to estab...
Location:
Canada, Mississauga
Salary:
170.00 USD / Year
Citi
Expiration Date:
Until further notice
Requirements:
  • 6+ years of relevant experience in Big Data application development or systems analysis role
  • Experience in leading and mentoring big data engineering teams
  • Strong understanding of big data concepts, architectures, and technologies (e.g., Hadoop, PySpark, Hive, Kafka, NoSQL databases)
  • Proficiency in programming languages such as Java, Scala, or Python
  • Excellent problem-solving and analytical skills
  • Strong presentation, communication and interpersonal skills
  • Experience with data warehousing and business intelligence tools
  • Experience with data visualization and reporting
  • Knowledge of cloud-based big data platforms (e.g., AWS EMR, Azure HDInsight, Google Cloud Dataproc)
  • Proficiency in Unix/Linux environments
Job Responsibility:
  • Lead and mentor a team of big data engineers, fostering a collaborative and high-performing environment
  • Provide technical guidance, code reviews, and support for professional development
  • Design and implement scalable and robust big data architectures and pipelines to handle large volumes of data from various sources
  • Evaluate and select appropriate big data technologies and tools based on project requirements and industry best practices
  • Implement and integrate these technologies into our existing infrastructure
  • Develop and optimize data processing and analysis workflows using technologies such as Spark, Hadoop, Hive, and other relevant tools
  • Implement data quality checks and ensure adherence to data governance policies and procedures
  • Continuously monitor and optimize the performance of big data systems and pipelines to ensure efficient data processing and retrieval
  • Collaborate effectively with cross-functional teams, including data scientists, business analysts, and product managers, to understand their data needs and deliver impactful solutions
  • Stay up to date with the latest advancements in big data technologies and explore new tools and techniques to improve our data infrastructure
What we offer:
  • Global benefits designed to support your well-being, growth, and work-life balance
Employment Type:
Fulltime

Technical Planning Architect

Location:
India, Hyderabad
Salary:
Not provided
Genzeon
Expiration Date:
Until further notice
Requirements:
  • 10+ years of experience in supply chain planning, business analysis, or related fields
  • o9 integration experience and familiarity with PySpark, SQL, big data environments, and managing integration pipelines using Airflow
  • Hands-on experience with o9 Solutions or similar advanced planning tools (e.g., SAP IBP, Blue Yonder, Kinaxis)
  • Strong analytical skills with a deep understanding of supply chain processes (demand, supply, inventory, and S&OP)
  • Excellent problem-solving abilities and attention to detail
  • Proficiency in data analysis tools (Excel, SQL, R, Python, or similar)
  • Ability to effectively communicate complex concepts to technical and non-technical audiences
  • Bachelor’s degree in supply chain management, computer science, business administration, or a related field
Job Responsibility:
  • Collaborate with cross-functional teams to design and implement scalable supply chain planning solutions
  • Leverage o9’s platform to develop integrated planning processes, including demand forecasting, inventory optimization, and supply planning
  • Engage with stakeholders to gather and document business requirements, ensuring alignment with strategic goals
  • Conduct gap analyses to identify areas of improvement and develop actionable insights
  • Lead supply chain planning initiatives, ensuring timely delivery of high-quality solutions
  • Act as the bridge between business teams and technical teams, translating business needs into system functionalities
  • Analyze current supply chain processes to identify inefficiencies and recommend best practices for optimization
  • Implement key metrics and KPIs to measure and improve supply chain performance
  • Provide training and support to end-users on planning systems and tools
  • Create and maintain documentation, including user guides and standard operating procedures

Senior Data Engineering Architect

Location:
Poland
Salary:
Not provided
Lingaro
Expiration Date:
Until further notice
Requirements:
  • Proven work experience as a Data Engineering Architect or a similar role and strong experience in the Data & Analytics area
  • Strong understanding of data engineering concepts, including data modeling, ETL processes, data pipelines, and data governance
  • Expertise in designing and implementing scalable and efficient data processing frameworks
  • In-depth knowledge of various data technologies and tools, such as relational databases, NoSQL databases, data lakes, data warehouses, and big data frameworks (e.g., Hadoop, Spark)
  • Experience in selecting and integrating appropriate technologies to meet business requirements and long-term data strategy
  • Ability to work closely with stakeholders to understand business needs and translate them into data engineering solutions
  • Strong analytical and problem-solving skills, with the ability to identify and address complex data engineering challenges
  • Proficiency in Python, PySpark, SQL
  • Familiarity with cloud platforms and services, such as AWS, GCP, or Azure, and experience in designing and implementing data solutions in a cloud environment
  • Knowledge of data governance principles and best practices, including data privacy and security regulations
Job Responsibility:
  • Collaborate with stakeholders to understand business requirements and translate them into data engineering solutions
  • Design and oversee the overall data architecture and infrastructure, ensuring scalability, performance, security, maintainability, and adherence to industry best practices
  • Define data models and data schemas to meet business needs, considering factors such as data volume, velocity, variety, and veracity
  • Select and integrate appropriate data technologies and tools, such as databases, data lakes, data warehouses, and big data frameworks, to support data processing and analysis
  • Create scalable and efficient data processing frameworks, including ETL (Extract, Transform, Load) processes, data pipelines, and data integration solutions
  • Ensure that data engineering solutions align with the organization's long-term data strategy and goals
  • Evaluate and recommend data governance strategies and practices, including data privacy, security, and compliance measures
  • Collaborate with data scientists, analysts, and other stakeholders to define data requirements and enable effective data analysis and reporting
  • Provide technical guidance and expertise to data engineering teams, promoting best practices and ensuring high-quality deliverables
  • Support the team throughout the implementation process, answering questions and addressing issues as they arise
  • Oversee the implementation of the solution, ensuring that it is implemented according to the design documents and technical specifications
What we offer:
  • Stable employment. On the market since 2008, 1500+ talents currently on board in 7 global sites
  • Workation. Enjoy working from inspiring locations in line with our workation policy
  • Great Place to Work® certified employer
  • Flexibility regarding working hours and your preferred form of contract
  • Comprehensive online onboarding program with a “Buddy” from day 1
  • Cooperation with top-tier engineers and experts
  • Unlimited access to the Udemy learning platform from day 1
  • Certificate training programs. Lingarians earn 500+ technology certificates yearly
  • Upskilling support. Capability development programs, Competency Centers, knowledge sharing sessions, community webinars, 110+ training opportunities yearly
  • Grow as we grow as a company. 76% of our managers are internal promotions

Data Engineering Architect

Data engineering involves the development of solutions for the collection, trans...
Location:
India
Salary:
Not provided
Lingaro
Expiration Date:
Until further notice
Requirements:
  • 10+ years’ experience in the Data & Analytics area
  • 4+ years’ experience in Data Engineering Architecture
  • Proficiency in Python, PySpark, SQL
  • Strong expertise in Azure cloud services such as ADF, Databricks, PySpark, and Logic Apps
  • Strong understanding of data engineering concepts, including data modeling, ETL processes, data pipelines, and data governance
  • Expertise in designing and implementing scalable and efficient data processing frameworks
  • In-depth knowledge of various data technologies and tools, such as relational databases, NoSQL databases, data lakes, data warehouses, and big data frameworks (e.g., Hadoop, Spark)
  • Experience in selecting and integrating appropriate technologies to meet business requirements and long-term data strategy
  • Ability to work closely with stakeholders to understand business needs and translate them into data engineering solutions
  • Strong analytical and problem-solving skills, with the ability to identify and address complex data engineering challenges
Job Responsibility:
  • Collaborate with stakeholders to understand business requirements and translate them into data engineering solutions
  • Design and oversee the overall data architecture and infrastructure, ensuring scalability, performance, security, maintainability, and adherence to industry best practices
  • Define data models and data schemas to meet business needs, considering factors such as data volume, velocity, variety, and veracity
  • Select and integrate appropriate data technologies and tools, such as databases, data lakes, data warehouses, and big data frameworks, to support data processing and analysis
  • Create scalable and efficient data processing frameworks, including ETL (Extract, Transform, Load) processes, data pipelines, and data integration solutions
  • Ensure that data engineering solutions align with the organization's long-term data strategy and goals
  • Evaluate and recommend data governance strategies and practices, including data privacy, security, and compliance measures
  • Collaborate with data scientists, analysts, and other stakeholders to define data requirements and enable effective data analysis and reporting
  • Provide technical guidance and expertise to data engineering teams, promoting best practices and ensuring high-quality deliverables
  • Support the team throughout the implementation process, answering questions and addressing issues as they arise
What we offer:
  • Stable employment
  • “Office as an option” model
  • Flexibility regarding working hours and your preferred form of contract
  • Comprehensive online onboarding program with a “Buddy” from day 1
  • Cooperation with top-tier engineers and experts
  • Unlimited access to the Udemy learning platform from day 1
  • Certificate training programs
  • Upskilling support
  • Internal Gallup Certified Strengths Coach to support your growth
  • Grow as we grow as a company

Data Architect

We are seeking a talented and experienced Data Architect/ Modeller to join our t...
Location:
India, Gurugram
Salary:
Not provided
Circle K
Expiration Date:
Until further notice
Requirements:
  • Full-Time bachelor’s or master’s degree in engineering/technology, computer science, information technology, or related fields
  • 10+ years of total experience in data modeling and database design; experience in the retail domain is an added advantage
  • 8+ years of experience in data engineering development and support
  • 3+ years of experience leading a technical team of data engineers and BI engineers
  • Proficiency in data modeling tools such as Erwin, ER/Studio, or similar tools
  • Strong knowledge of Azure cloud infrastructure and development in SQL/Python/PySpark with ADF, Synapse, and Databricks
  • Hands-on experience with Azure Data Factory, Azure Synapse Analytics, Azure Analysis Services, Azure Databricks, Blob Storage, Python/PySpark, Logic Apps, Key Vault, and Azure Functions
  • Strong communication, interpersonal, and collaboration skills along with leadership capabilities
  • Ability to work effectively in a fast-paced, dynamic environment as a cloud SME
  • Act as a single point of contact for all kinds of data management queries to support data decisions
Job Responsibility:
  • Collaborate with solution architects, data engineers, business stakeholders, business analysts, and DQ testers to ensure the data management and data governance frameworks are defined as critical components
  • Design and develop data models using industry-standard modeling techniques and tools
  • Perform data profiling, data lineage and analysis to understand data quality, structure, and relationships
  • Optimize data models for performance, scalability, and usability by creating an optimal data storage layer
  • Define and enforce data modeling standards, best practices, and guidelines
  • Participate in data governance initiatives to ensure compliance with data management policies and standards
  • Work closely with database administrators and developers to implement data models in relational and non-relational database systems
  • Conduct data model reviews and provide recommendations for improvements
  • Stay updated on emerging trends and technologies in data modeling and data management
Employment Type:
Fulltime

Senior Data Engineering Manager

Lead and mentor data engineering team to scale data platform, establish best pra...
Location:
United States, Work at Home, Illinois
Salary:
130295.00 - 260590.00 USD / Year
CVS Health
Expiration Date:
January 19, 2026
Requirements:
  • Bachelor's or master's degree in Computer Science, Data Science, or related field
  • 7+ years of experience in data engineering
  • 5+ years in technical leadership or management role
  • Experience building and leading high-performing data engineering teams
  • Deep expertise with cloud platforms (AWS, Azure, or GCP)
  • Experience with big data frameworks (Apache Spark, Hadoop)
  • Experience with data warehousing solutions (Snowflake, Redshift, BigQuery)
  • Experience with workflow orchestration tools (Airflow, Dagster)
  • Solid experience with SQL, Python, PySpark
  • Excellent communication, interpersonal, and leadership skills
Job Responsibility:
  • Lead, mentor, and grow team of 6-8 data engineers
  • Manage hiring, training, and professional development
  • Conduct performance reviews and provide feedback
  • Own technical vision and roadmap for data platform
  • Lead design, development, and maintenance of data pipelines and data warehouses
  • Drive best practices in data modeling, ETL/ELT processes, and data governance
  • Oversee implementation of new data technologies and architectures
  • Partner with product managers, data scientists, analysts, and other engineering teams
  • Ensure data platform meets standards for performance, security, and data quality
  • Establish culture of operational excellence including monitoring and incident response
What we offer:
  • Affordable medical plan options
  • 401(k) plan with matching company contributions
  • Employee stock purchase plan
  • Wellness screenings
  • Tobacco cessation and weight management programs
  • Confidential counseling and financial coaching
  • Paid time off
  • Flexible work schedules
  • Family leave
  • Dependent care resources
Employment Type:
Fulltime

Azure Data Engineer

Experience: 3-6+ Years Location: Noida/Gurugram/Remote Skills: PYTHON, PYSPARK...
Location:
India, Noida; Gurugram
Salary:
Not provided
NexGen Tech Solutions
Expiration Date:
Until further notice
Requirements:
  • 3-6+ years of experience
  • Python
  • PySpark
  • SQL
  • Azure Data Factory
  • Databricks
  • Data Lake
  • Azure Functions
  • Data pipelines
Job Responsibility:
  • Design and engineer cloud/big data solutions and develop a modern data analytics lake
  • Develop & maintain data pipelines for batch & stream processing using modern cloud or open source ETL/ELT tools
  • Liaise with business team and technical leads, gather requirements, identify data sources, identify data quality issues, design target data structures, develop pipelines and data processing routines, perform unit testing and support UAT
  • Implement continuous integration, continuous deployment, and DevOps practices
  • Create, document, and manage data guidelines, governance, and lineage metrics
  • Technically lead, design and develop distributed, high-throughput, low-latency, highly available data processing and data systems
  • Build monitoring tools for server-side components
  • Work cohesively in an India-wide distributed team
  • Identify, design, and implement internal process improvements and tools to automate data processing and ensure data integrity while meeting data security standards
  • Build tools for better discovery and consumption of data for various consumption models in the organization – DataMarts, Warehouses, APIs, Ad Hoc Data explorations
Employment Type:
Fulltime