Apache Spark Technical Lead Job at Sopra Steria (Noida)

Apache Spark Senior Technical Lead

We are looking for a skilled Data Engineer to build and maintain scalable data p...

Location

India, Chennai

Salary:

Not provided

Sopra Steria

Expiration Date

Until further notice

Requirements

Expertise in Spark and Scala for large-scale data processing
Proficiency in Python and SQL, including hands-on experience with PL/SQL
Experience in data integration/ETL workflows
Strong knowledge of relational databases such as Oracle and PostgreSQL
Excellent verbal and written communication skills
Result-oriented with strong analytical and problem-solving skills
Ability to work in a fast-paced, agile, and collaborative environment

Job Responsibility

Maintain, develop, and design data ingestion and transformation pipelines using Spark and Scala
Process and optimize SQL and PL/SQL queries for large datasets
Integrate and process data from multiple sources for analytics and reporting
Collaborate with BI and Analytics teams to support reporting and dashboard needs
Ensure data quality, performance, and security
Provide innovative solutions to complex business/technical problems
Identify and resolve technical, integration and development issues
Implement best practices, standards and processes to ensure quality of the final product
Develop features independently
Own continuous build, automated testing, and release management

What we offer

Inclusive and respectful work environment
Open to people with disabilities

Fulltime

Technical Lead, Spark (Java)

At Cloudera, we empower people to transform complex data into clear and actionab...

Location

Salary:

Not provided

Cloudera

Expiration Date

Until further notice

Requirements

8-10+ years of professional software development
Experience leading and delivering complex product enhancements
Strong understanding of at least one of the following languages: Java, Scala, Python
Experience with systems design and development
Passionate about programming, clean coding habits, attention to detail, and focus on quality
Strong oral and written communication skills
Strong ability to research and solve problems independently without constant supervision
Open-minded, desire to learn new things and build great products
Experience with distributed systems

Job Responsibility

Design new features for Cloudera’s data engineering experience, and take them from prototype to production at scale, leading the team that delivers them
Contribute to Apache Spark, Livy
Develop new features in Scala/Java/Python on modern platforms
Gain expertise in distributed data processing, from SQL planners and optimizers, to data layout and table formats like Apache Parquet and Iceberg, to fault tolerance in distributed systems
Gain a solid understanding and deep technical knowledge of components across the Cloudera Data Engineering Experience stack, with a focus on Iceberg and Spark, which you can apply in your daily tasks
Get to work on large-scale distributed systems, from 100s to 1000s of nodes, in production clusters
Debug system-level deployment issues, perform root cause analysis and system test analysis, and resolve failures
Work on improving internal infrastructure
Collaborate with other team members and stakeholders

What we offer

Generous PTO Policy
Support work life balance with Unplugged Days
Flexible WFH Policy
Mental & Physical Wellness programs
Phone and Internet Reimbursement program
Access to Continued Career Development
Comprehensive Benefits and Competitive Packages
Paid Volunteer Time
Employee Resource Groups

Fulltime

Senior Technical Lead - Data & AI (Oil & Gas)

At Codvo, we are committed to building scalable, future-ready data platforms tha...

Location

India, Pune

Salary:

Not provided

Codvo AI

Expiration Date

Until further notice

Requirements

Experience: 12+ years in technical roles, with at least 4+ years in a technical leadership or principal engineer capacity, managing cross-functional engineering teams
Cloud Expertise: Expert-level, hands-on knowledge of Microsoft Azure as a primary platform
Data & AI Expertise: Proven mastery of Databricks and Apache Spark (PySpark)
Extensive experience designing and building large-scale ETL/ELT data pipelines
Strong understanding of AI/ML workflows and data requirements for industrial AI use cases
DevOps & IaC Expertise: Deep proficiency with Infrastructure as Code (Terraform, ARM templates)
Hands-on mastery of Kubernetes and containerization (Docker)
Proven experience building and managing complex CI/CD pipelines (Azure DevOps, Jenkins)
Cloud Security: Strong, practical knowledge of cloud network security (VNETs, NSGs, Firewalls) and identity management (Azure Key Vault, IAM)
Industry Knowledge: Experience in the Oil & Gas sector or a similar heavy-industry/critical infrastructure environment

Job Responsibility

Technical Leadership & Mentorship: Lead, mentor, and provide technical direction to a multi-disciplinary team of engineers (Data, DevOps, Security). Foster a culture of high technical standards, collaboration, and continuous improvement
Detailed Solution Design: Partner with the Solutions Architect and Project Manager to decompose high-level architecture into detailed technical designs, actionable tasks, and implementation plans for the engineering team
Hands-on Development: Act as the lead engineer, personally designing and building the most complex components of the solution, including data pipelines, IaC modules, and CI/CD frameworks
Cross-Functional Integration: Drive the seamless integration of diverse systems, including OT data from historians (OSIsoft PI), data platforms (Databricks), cloud infrastructure (Azure), and CI/CD pipelines (Azure DevOps)
Technical Governance & Quality: Enforce technical best practices across all domains. Conduct rigorous code reviews, design reviews, and security assessments. Ensure all solutions adhere to security-by-design principles, CI/CD automation standards, and data governance policies
Stakeholder Collaboration: Translate complex technical concepts and risks to the Project Manager and business stakeholders. Manage technical dependencies and help the PM with accurate effort estimation and milestone tracking
Problem Solving: Serve as the primary escalation point for the team's most complex technical challenges, leading root cause analysis and implementing robust, long-term solutions

Fulltime

Technical Lead - Fixed Income Data Services

This role is for an Application Development Technical lead within the FI Data te...

Location

Canada, Mississauga

Salary:

120800.00 - 170800.00 USD / Year

Citi

Expiration Date

Until further notice

Requirements

6+ years of demonstrable and relevant experience in software development, with at least 3-5 years in a leadership role within a high-performing technical team
Strong understanding of Java, with the ability to guide and review complex solutions
Solid understanding of REST API development, including best practices for design, security, and scalability
Demonstrable experience in driving the creation of reusable, testable, and efficient code with proper error and exception handling, and establishing coding standards
Extensive experience with the design and implementation of cloud-native applications and deployment via Kubernetes / Openshift, including strategic decision-making on cloud architecture
Expertise in big data computation platforms (Flink, Spark, Apache Beam) or big data distribution platforms (Hadoop, Druid, Pinot, Trino, Ignite), and a track record of leading teams leveraging these technologies
Hands-on experience in handling various data structures, and the ability to guide complex data modeling decisions
Proven leadership in establishing and maturing Continuous Integration and Continuous Delivery environments. Familiarity with TeamCity, Sonarqube, and Jenkins
Extensive experience with the SDLC and in leading and coaching within an Agile environment (Scrum/Kanban)
Demonstrable leadership in promoting and enforcing engineering best practices: design patterns, coding standards, rigorous code review processes, and comprehensive unit testing strategies (e.g., Mockito, Junit, Pytest)

Job Responsibility

Lead and oversee the design and development of high-performance green-field data analytics products for a Tier 1 bank, ensuring architectural excellence and alignment with business goals
Collaborate strategically with other dev leads in US and Canada, translating complex business requirements into technical roadmaps and fostering a partnership approach to deliver impactful solutions
Drive innovation within the team, encouraging the exploration and implementation of cutting-edge data visualization and analytics solutions
Mentor and guide team members in applying an engineering mindset, fostering deep understanding of use-cases, developing robust estimation techniques for volume and compute velocity, and openly addressing implementation limitations
Lead the evaluation and development of Proof-of-Concepts (POCs) for new strategic initiatives, guiding the team to convert successful prototypes into robust enterprise solutions
Foster a culture of continuous learning and growth within the team, empowering members to research, learn, and recommend emerging technologies
Provide leadership and strategic direction for post-release support, collaborating closely with business, development, and support groups to ensure operational stability and client satisfaction
Manage team performance, including goal setting, performance reviews, career development, and providing regular feedback to foster professional growth
Participate in hiring processes, attracting, interviewing, and onboarding top talent to grow the team's capabilities
Facilitate effective communication within the team and across different stakeholders, ensuring transparency and alignment

Fulltime

Senior Software Engineer - Core Java & Apache Spark

We are hiring an elite Senior Software Engineer to build and scale our core data...

Location

India, Chennai, Pune

Salary:

Not provided

Citi

Expiration Date

Until further notice

Requirements

Core Java & JVM: Expert-level proficiency in Java, including the Collections Framework, Lambdas, and the Java Concurrency API. Demonstrable experience tuning the JVM and troubleshooting memory/GC issues
Apache Spark: Proven, hands-on experience developing, deploying, and tuning complex Spark applications for large-scale data transformation and analysis
Spring Ecosystem: Extensive, practical experience with the Spring Framework, particularly Spring Boot, Spring Data, and Spring Batch in a production environment
Data Structures & Algorithms: Deep understanding of fundamental data structures and algorithms, with a focus on their application in distributed computing and performance-critical systems
Containerization & Cloud-Native: Hands-on experience with Docker for building images and Kubernetes/OpenShift for deploying and managing distributed applications
Database Engineering: Strong command of SQL and relational database design, including transaction management and indexing. Experience with at least one production NoSQL database (MongoDB, Graph DB, etc.)
Architectural Design: Practical application of OOP, SOLID, and DDD principles to build maintainable and scalable systems. You write tests first (TDD) and believe in robust, automated testing

Job Responsibility

Architect & Build: Design and construct high-throughput, low-latency data processing pipelines using Apache Spark and the Spring ecosystem
Performance Engineering: Dive deep into JVM internals, garbage collection tuning, and Spark job optimization to maximize performance and resource efficiency
Distributed Systems Design: Implement scalable, resilient, and transactional architectures leveraging container orchestration (Kubernetes/OpenShift) and distributed data stores
Code & Design Excellence: Champion and enforce best practices in software engineering, including SOLID principles, advanced design patterns, Domain-Driven Design (DDD), and Test-Driven Development (TDD)
Database Mastery: Engineer and optimize data models for both relational and NoSQL databases, ensuring data integrity, performance, and scalability
CI/CD Automation: Own and enhance CI/CD pipelines for automated build, test, and deployment of Java applications and Spark jobs in a containerized environment
Technical Leadership: Lead design and code reviews, mentor junior engineers, and drive the adoption of new technologies and architectural patterns across the team

Fulltime

Enterprise AI Solutions Technical Lead

Act as the technical lead responsible for implementing enterprise customers’ req...

Location

Egypt, Giza

Salary:

Not provided

Vodafone

Expiration Date

Until further notice

Requirements

Bachelor’s or Master’s degree in Computer Science, Data Science, Artificial Intelligence, or a related field
5–7 years of hands-on experience in developing and deploying AI/ML models in production environments
Advanced proficiency in ML libraries and frameworks (e.g., TensorFlow, PyTorch, Scikit-learn)
Strong expertise in building advanced Generative AI (GenAI) and Natural Language Processing (NLP) applications using modern large language models and frameworks (e.g., Hugging Face, Langchain, OpenAI APIs, Agents Frameworks)
Experience with big data processing and distributed computing using Apache Spark or similar technologies
Proficiency in building, deploying, and managing AI, GenAI and big data applications on cloud platforms (e.g., AWS, Azure, GCP)
Experience with MLOps tools and techniques for versioning, CI/CD, monitoring, and scaling models

Job Responsibility

Translate customers’ requirements into actionable data science tasks and implementation plans
Lead the development, training, and deployment of AI models and data pipelines tailored to customer-specific requirements
Ensure models and AI components are scalable, efficient, and production-ready, with smooth integration into customer systems and enterprise platforms
Serve as the subject matter expert (SME) in machine learning, data science, and AI engineering practices
Provide hands-on technical guidance to data scientists, engineers, and MLOps teams, ensuring accurate and effective implementation
Lead technical squads through virtual collaboration, guaranteeing solutions are built to meet both customer expectations and technical standards
Validate model performance, data quality, and technical fit before solution handover
Collaborate with the Enterprise Tech Leads and Product Owners to review and confirm solution readiness based on business impact, feasibility, and model validation results
Provide input on task prioritization based on complexity, risk, and technical dependencies
Work alongside the Enterprise team to ensure implemented AI solutions meet business outcomes and address real customer needs

Technical Lead & Manager, Fixed Income Data Engineering

This role is for an application dev lead within the FI Data team, responsible for...

Location

Canada, Mississauga

Salary:

120800.00 - 170800.00 USD / Year

Citi

Expiration Date

Until further notice

Requirements

6+ years of demonstrable and relevant experience in software development
At least 3-5 years in a leadership role within a high-performing technical team
Strong understanding of Python 3.6
Solid understanding of REST API development
Demonstrable experience in driving the creation of reusable, testable, and efficient code
Extensive experience with the design and implementation of cloud-native applications and deployment via Kubernetes / Openshift
Expertise in big data computation platforms (Flink, Spark, Apache Beam) or big data distribution platforms (Hadoop, Druid, Pinot, Trino, Ignite)
Hands-on experience in handling various data structures
Proven leadership in establishing and maturing Continuous Integration and Continuous Delivery environments
Extensive experience with the SDLC and in leading and coaching within an Agile environment (Scrum/Kanban)

Job Responsibility

Lead and oversee the design and development of high-performance green-field data analytics products
Collaborate strategically with other dev leads in US and Canada
Drive innovation within the team
Mentor and guide team members in applying an engineering mindset
Lead the evaluation and development of Proof-of-Concepts (POCs) for new strategic initiatives
Foster a culture of continuous learning and growth within the team
Provide leadership and strategic direction for post-release support
Manage team performance, including goal setting, performance reviews, career development
Participate in hiring processes
Facilitate effective communication within the team and across different stakeholders

Fulltime
