Design, develop, and maintain scalable, enterprise-grade AI agents that support ELT/ETL processes handling large data volumes, using Python, FastAPI, microservices, PySpark, Kafka, and the Databricks ecosystem
Build and deploy GenAI agents using Google's ADK and Google Flash 2.5+ LLMs to support application automation, deep insights, and workflows with a human-in-the-loop (HIL) architecture
Build and maintain data federation layers for Lambda and Data Mesh architectures using tools like Starburst, with a strategy for adopting AI-based use cases (e.g., machine learning, deep learning, NLP) to drive efficiency
Develop, deploy, and automate microservice integrations to support data-intensive applications, ensuring scalability, resilience, and maintainability using cloud-native infrastructure on OpenShift or Kubernetes, including CI/CD pipelines
Integrate and leverage agentic AI tools (e.g., Devin.AI, GitHub Copilot) and platforms (e.g., MCP) through advanced prompt engineering to enhance development and operational efficiency
Ensure data quality, integrity, and security throughout the entire data lifecycle
Contribute to the continuous improvement of data engineering processes, standards, and best practices within the team
Appropriately assess risk when business decisions are made, demonstrating consideration for the firm's reputation and safeguarding Citi, its clients, and assets by driving compliance with applicable laws, rules, and regulations. Adhere to Policy, apply sound ethical judgment, and escalate, manage, and report control issues with transparency
Requirements
8+ years of overall experience in large-scale application development, including recent (mandatory) experience with a platform for the secure and scalable deployment of AI agents into application contexts
5+ years of proven experience in a Python and PySpark engineering lead role focused on building enterprise-grade, high-volume ELT/ETL processes using the PySpark and Databricks ecosystem
Hands-on experience with agentic AI development using YAML, JSON, FastAPI or Spring Boot, Google ADK, and LLM integrations, including tools such as Devin.AI or GitHub Copilot, and with integrating models via platforms like MCP using advanced prompt engineering
Proven experience developing and automating microservice integrations to support data-intensive applications
Proficiency in at least one programming language commonly used for data analytics and engineering, such as Python or Scala
Strong SQL skills and experience with various relational databases
Deep understanding of data modeling, data warehousing concepts, Data Mesh architecture, and data federation
Excellent communication, collaboration, and problem-solving skills
Bachelor's degree in Computer Science, Engineering, or a related field
Nice to have
Experience with cloud-based Big Data platforms (e.g., Cloudera, Databricks, AWS, Azure, GCP)
Experience with frontend technologies such as Angular or React JS for building data-driven application interfaces
Practical experience applying AI/ML techniques to solve real-world business problems
Familiarity with containerization technologies (e.g., Docker, Kubernetes)
Experience in data engineering within the retail banking products domain (e.g., Cards, Mortgage, Deposits, Wealth Management)
Relevant industry certifications (e.g., AWS Certified Big Data - Specialty, Azure Data Engineer Associate)