Lead Data Engineer

Citi
https://www.citi.com/

Location:
Mississauga, Canada

Category:
IT - Software Development

Contract Type:
Employment contract

Salary:
Not provided

Job Description:

The Fixed Income data team is responsible for monetizing data generated by Citi's fixed income businesses and for building data analytics tools and services that provide actionable insights with a direct impact on revenue. The Lead Data Engineer will design, implement, and optimize distributed data processing jobs that handle large-scale data in the Hadoop Distributed File System (HDFS) and S3 storage using Apache Kafka, Flink (Java and Flink SQL), Apache Spark, and Python. The role requires a deep understanding of data engineering principles, proficiency in Java and Python, and hands-on experience with the Kafka and S3 ecosystems. The developer will collaborate with data engineers, analysts, and business stakeholders to process and transform data and to drive insights and data-driven decisions.
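
To illustrate the stack described above, the following is a minimal PyFlink sketch of a Kafka-to-S3 Flink SQL pipeline, not Citi's actual implementation; the topic name, broker address, schema, bucket path, and filter are all hypothetical.

# Minimal sketch of a Flink SQL pipeline: Kafka source -> S3 Parquet sink.
# All names below (topic, broker, bucket, columns) are hypothetical.
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Kafka source table: reads JSON trade events from a hypothetical topic.
t_env.execute_sql("""
    CREATE TABLE trades_in (
        trade_id STRING,
        price    DECIMAL(18, 4),
        ts       TIMESTAMP(3),
        WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'fi-trades',
        'properties.bootstrap.servers' = 'broker:9092',
        'properties.group.id' = 'fi-data-etl',
        'scan.startup.mode' = 'earliest-offset',
        'format' = 'json'
    )
""")

# S3 sink table: lands the cleaned stream as Parquet files.
t_env.execute_sql("""
    CREATE TABLE trades_out (
        trade_id STRING,
        price    DECIMAL(18, 4),
        ts       TIMESTAMP(3)
    ) WITH (
        'connector' = 'filesystem',
        'path' = 's3a://example-bucket/trades/',
        'format' = 'parquet'
    )
""")

# Simple transformation: drop non-positive prices before writing to S3.
t_env.execute_sql("""
    INSERT INTO trades_out
    SELECT trade_id, price, ts
    FROM trades_in
    WHERE price > 0
""")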

Job Responsibilities:

  • design and implement big data warehouse applications to process and transform large datasets
  • develop ETL pipelines with Apache Kafka, Flink, Spark, and Python for data ingestion, cleaning, aggregation, and transformation
  • send data to downstream systems by generating feeds or publishing to Kafka topics
  • optimize ETL jobs for efficiency, reducing run time and resource usage
  • fine-tune memory management, caching, and partitioning strategies for optimal performance
  • load data from different sources into S3 storage, ensuring data accuracy and integrity
  • troubleshoot and debug Kafka job failures, monitoring job logs and the Kafka UI Manager to identify issues
  • identify and address coding vulnerabilities
  • enforce coding standards to eliminate code vulnerabilities
  • adhere to big data best practices, including small-file elimination, Hive SRE scan success, and archival implementation for optimal architecture utilization (a minimal compaction sketch follows this list)
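
On the small-file elimination point in the final bullet, compaction is often run as a periodic batch job. The sketch below is one minimal PySpark version, with hypothetical bucket paths and a hard-coded output file count standing in for a size-based heuristic.

# Minimal small-file compaction sketch in PySpark.
# The S3 paths and the output file count are hypothetical assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("small-file-compaction").getOrCreate()

# Hypothetical S3 partition that has accumulated many small Parquet files.
source = "s3a://example-bucket/trades/dt=2025-05-22/"
staging = "s3a://example-bucket/trades_compacted/dt=2025-05-22/"

df = spark.read.parquet(source)

# Rewrite the partition as a handful of larger files. In practice the
# target count is derived from total partition bytes / desired file size
# (e.g. ~128 MB); it is hard-coded here for brevity. Writing to a staging
# path and swapping it in afterwards avoids reading and overwriting the
# same location within one job.
df.repartition(8).write.mode("overwrite").parquet(staging)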

Requirements:

  • 8+ years of relevant experience with the Hadoop Distributed File System (HDFS) using Apache Spark, Python, Java, and SQL
  • 2+ years of relevant experience with S3 storage using Apache Kafka, Flink Java, and Flink SQL with minimal latency; monitor and optimize the performance of Kafka clusters
  • troubleshoot and resolve issues related to Kafka and data processing
  • implement best practices for Kafka architecture and operations
  • experience in systems analysis and programming of software applications
  • experience in managing and implementing successful projects
  • working knowledge of consulting/project management techniques/methods
  • ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
  • strong communication skills and attention to detail and accuracy
  • demonstrated leadership skills
  • basic knowledge of industry practices and standards
  • consistently demonstrates clear and concise written and verbal communication

Nice to have:

  • prior financial industry experience

Additional Information:

Job Posted:
May 23, 2025

Employment Type:
Full-time
Work Type:
Hybrid work