The Fixed Income Data team is responsible for monetizing data generated by Citi's fixed income businesses and for building data analytics tools and services that provide actionable insights with direct revenue impact. The Lead Data Engineer will design, implement, and optimize distributed data processing jobs that handle large-scale data in the Hadoop Distributed File System (HDFS) and S3 storage using Apache Kafka, Flink (Java and SQL), Apache Spark, and Python. This role requires a deep understanding of data engineering principles, proficiency in Java and Python, and hands-on experience with the Kafka and S3 ecosystems. The engineer will collaborate with data engineers, analysts, and business stakeholders to process and transform data and to drive insights and data-driven decisions.
Job Responsibilities:
design and implement big data warehouse applications to process and transform large datasets
develop ETL pipelines with Apache Kafka, Flink, Spark, and Python for data ingestion, cleaning, aggregation, and transformation
send data to downstream systems by generating feeds or publishing to Kafka topics
optimize ETL jobs for efficiency, reducing run time and resource usage
fine-tune memory management, caching, and partitioning strategies for optimal performance
load data from different sources into S3 storage, ensuring data accuracy and integrity
troubleshoot and debug Kafka job failures; monitor job logs and the Kafka UI manager to identify issues
identify and address coding vulnerabilities
enforce coding standards to eliminate code vulnerabilities
adhere to big data best practices, including eliminating small files, passing Hive SRE scans, and implementing archival for ideal architecture utilization
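The ingestion, cleaning, and aggregation stages listed above can be sketched in plain Python. This is a minimal illustration of the transform logic only; a production pipeline at this scale would run on Spark or Flink, and the record fields (`desk`, `notional`, `currency`) are hypothetical, not an actual schema.

```python
from collections import defaultdict

# Simulated raw records as they might arrive from an ingestion feed
# (field names and values are illustrative only).
raw_records = [
    {"desk": "rates", "notional": "1500000", "currency": "USD"},
    {"desk": "credit", "notional": "", "currency": "USD"},  # missing notional
    {"desk": "rates", "notional": "2500000", "currency": "USD"},
    {"desk": "credit", "notional": "750000", "currency": "EUR"},
]

def clean(records):
    """Cleaning stage: drop records with missing fields, normalize types."""
    for rec in records:
        if rec["notional"]:
            yield {**rec, "notional": float(rec["notional"])}

def aggregate(records):
    """Aggregation stage: sum notional per (desk, currency) key."""
    totals = defaultdict(float)
    for rec in records:
        totals[(rec["desk"], rec["currency"])] += rec["notional"]
    return dict(totals)

totals = aggregate(clean(raw_records))
print(totals)
# {('rates', 'USD'): 4000000.0, ('credit', 'EUR'): 750000.0}
```

In Spark or Flink the same shape appears as a filter/map followed by a keyed aggregation; the cleaning step running before aggregation is what keeps malformed records from corrupting downstream totals.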
Requirements:
8+ years of relevant experience with the Hadoop Distributed File System (HDFS) using Apache Spark, Python, Java, and SQL
2+ years of relevant experience with S3 storage using Apache Kafka and Flink (Java and SQL) with minimal latency; monitor and optimize the performance of Kafka clusters
troubleshoot and resolve issues related to Kafka and data processing
implement best practices for Kafka architecture and operations
experience in systems analysis and programming of software applications
experience in managing and implementing successful projects
working knowledge of consulting/project management techniques/methods
ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
strong communication skills and attention to detail and accuracy
demonstrated leadership skills
basic knowledge of industry practices and standards
consistently demonstrates clear and concise written and verbal communication
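One concrete piece of the Kafka monitoring work named in the requirements is tracking consumer lag: the gap between the latest offset written to a partition and the offset the consumer group has committed. In practice these offsets come from Kafka's AdminClient or monitoring tools; the sketch below only shows the arithmetic, with illustrative offset values.

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Per-partition consumer lag: how far the consumer group trails
    the end of the partition's log."""
    return {
        partition: end - committed_offsets.get(partition, 0)
        for partition, end in log_end_offsets.items()
    }

# Hypothetical offsets for a three-partition topic.
log_end = {0: 1200, 1: 980, 2: 1500}
committed = {0: 1200, 1: 940, 2: 1100}

lag = consumer_lag(log_end, committed)
print(lag)  # {0: 0, 1: 40, 2: 400}
```

A lag that stays near zero indicates a healthy consumer; a steadily growing lag on one partition (like partition 2 here) is a common first signal when troubleshooting a stalled or under-provisioned Kafka job.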