Lead Data Engineer

Citi
https://www.citi.com/

Location:
Mississauga, Canada

Category:
IT - Software Development

Contract Type:
Employment contract

Salary:
Not provided

Job Description:

The Fixed Income data team is responsible for monetizing data generated by Citi's fixed income businesses and for building data analytics tools and services that provide actionable insights with a direct impact on revenue. The Lead Data Engineer will design, implement, and optimize distributed data processing jobs that handle large-scale data in the Hadoop Distributed File System (HDFS) and S3 storage using Apache Kafka, Flink (Java and Flink SQL), Apache Spark, and Python. The role requires a deep understanding of data engineering principles, proficiency in Java and Python, and hands-on experience with the Kafka and S3 ecosystems. The developer will collaborate with data engineers, analysts, and business stakeholders to process and transform data and to drive insights and data-driven decisions.
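
To illustrate the stack described above, the following is a minimal PyFlink sketch of a Kafka-to-S3 Flink SQL pipeline, not Citi's actual implementation; the topic name, broker address, schema, bucket path, and filter are all hypothetical.

# Minimal sketch of a Flink SQL pipeline: Kafka source -> S3 Parquet sink.
# All names below (topic, broker, bucket, columns) are hypothetical.
from pyflink.table import EnvironmentSettings, TableEnvironment

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Kafka source table: reads JSON trade events from a hypothetical topic.
t_env.execute_sql("""
    CREATE TABLE trades_in (
        trade_id STRING,
        price    DECIMAL(18, 4),
        ts       TIMESTAMP(3),
        WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'fi-trades',
        'properties.bootstrap.servers' = 'broker:9092',
        'properties.group.id' = 'fi-data-etl',
        'scan.startup.mode' = 'earliest-offset',
        'format' = 'json'
    )
""")

# S3 sink table: lands the cleaned stream as Parquet files.
t_env.execute_sql("""
    CREATE TABLE trades_out (
        trade_id STRING,
        price    DECIMAL(18, 4),
        ts       TIMESTAMP(3)
    ) WITH (
        'connector' = 'filesystem',
        'path' = 's3a://example-bucket/trades/',
        'format' = 'parquet'
    )
""")

# Simple transformation: drop non-positive prices before writing to S3.
t_env.execute_sql("""
    INSERT INTO trades_out
    SELECT trade_id, price, ts
    FROM trades_in
    WHERE price > 0
""")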

Job Responsibilities:

  • design and implement big data warehouse applications to process and transform large datasets
  • develop ETL pipelines with Apache Kafka, Flink, Spark, and Python for data ingestion, cleaning, aggregation, and transformation
  • send data to downstream systems by generating feeds or publishing to Kafka topics
  • optimize ETL jobs for efficiency, reducing run time and resource usage
  • fine-tune memory management, caching, and partitioning strategies for optimal performance
  • load data from different sources into S3 storage, ensuring data accuracy and integrity
  • troubleshoot and debug Kafka job failures, monitoring job logs and the Kafka UI Manager to identify issues
  • identify and address coding vulnerabilities
  • enforce coding standards to eliminate code vulnerabilities
  • adhere to big data best practices, including small-file elimination, Hive SRE scan success, and archival implementation for optimal architecture utilization (a minimal compaction sketch follows this list)
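
On the small-file elimination point in the final bullet, compaction is often run as a periodic batch job. The sketch below is one minimal PySpark version, with hypothetical bucket paths and a hard-coded output file count standing in for a size-based heuristic.

# Minimal small-file compaction sketch in PySpark.
# The S3 paths and the output file count are hypothetical assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("small-file-compaction").getOrCreate()

# Hypothetical S3 partition that has accumulated many small Parquet files.
source = "s3a://example-bucket/trades/dt=2025-05-22/"
staging = "s3a://example-bucket/trades_compacted/dt=2025-05-22/"

df = spark.read.parquet(source)

# Rewrite the partition as a handful of larger files. In practice the
# target count is derived from total partition bytes / desired file size
# (e.g. ~128 MB); it is hard-coded here for brevity. Writing to a staging
# path and swapping it in afterwards avoids reading and overwriting the
# same location within one job.
df.repartition(8).write.mode("overwrite").parquet(staging)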

Requirements:

  • 8+ years of relevant experience with the Hadoop Distributed File System (HDFS) using Apache Spark, Python, Java, and SQL
  • 2+ years of relevant experience with S3 storage using Apache Kafka, Flink Java, and Flink SQL with minimal latency; monitor and optimize the performance of Kafka clusters
  • troubleshoot and resolve issues related to Kafka and data processing
  • implement best practices for Kafka architecture and operations
  • experience in systems analysis and programming of software applications
  • experience in managing and implementing successful projects
  • working knowledge of consulting/project management techniques/methods
  • ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
  • strong communication skills and attention to detail and accuracy
  • demonstrated leadership skills
  • basic knowledge of industry practices and standards
  • consistently demonstrates clear and concise written and verbal communication

Nice to have:

  • prior financial industry experience

Additional Information:

Job Posted:
May 23, 2025

Employment Type:
Full-time
Work Type:
Hybrid work