Engineer the future of global finance. At Citi, our Tech team doesn’t just support finance – we are helping to redefine it. Every day, $5 trillion crosses through our network. We do business in 180+ countries, operating at a scale few can match. From deploying advanced AI to helping shape global markets, we build systems that matter. Join a team where your work helps influence economies, your ideas can drive innovation and outcomes, and your growth is backed by mentorship, continuous learning and flexibility with potential hybrid work opportunities. Help solve real-world challenges that touch millions and get the opportunity to build the future of finance with Citi Tech.

We are seeking a motivated individual contributor to work in our AI and DevOps Platform Support team in EMEA. This role is responsible for ensuring the stability, reliability, and performance of our critical AI and DevOps platforms. The team supports a wide range of services, including multiple AI applications, developer tools, and CI/CD pipeline technologies used by teams across the organization. The ideal candidate will manage incident and problem resolution and collaborate with engineering and development teams to improve platform services and supportability. The role also involves short- to medium-term planning of actions and resources for its own area.
Job Responsibilities:
Ensure the stability, reliability, and performance of our critical AI and DevOps platforms
Manage incident and problem resolution and collaborate with engineering and development teams to improve platform services and supportability
Manage vendor relationships, including oversight of all offshore managed services
Improve the service level the team provides to our end users by maximizing operational efficiencies and strengthening incident management, problem management, and knowledge-sharing practices
Guide development teams on application stability and supportability improvements
Formulate and implement a framework for managing capacity, throughput and latency
Define and implement application onboarding guidelines and standards
Coach team members on how to maximize their potential
Drive continued cost reductions and efficiencies across the portfolios supported
Evaluate subordinates' performance and make decisions on pay increases, hiring, terminations and other personnel actions
Participate in business review meetings, relating technology tool strategies to business requirements
Ensure adherence to all support process and tool standards
Act as the primary point of contact for platform matters, defining the vision and roadmap
Champion the platform's resilience strategy by planning and executing wargaming scenarios, chaos engineering tests, and disaster recovery drills
Drive a comprehensive automation strategy to reduce manual toil, improve deployment velocity, and identify opportunities to leverage AI for operational intelligence
Provide in-depth analysis with interpretive thinking to define problems and develop innovative solutions
Solve the highest-impact, highest-profile problems
Develop and implement AI-powered solutions to automate routine support tasks, predict system failures, and optimize resource utilization
Requirements:
Project management experience with demonstrable results in improving IT services
Exposure to capacity planning and forecasting is a plus
Ability to plan and organize workload
Clear and concise written and verbal communication skills, demonstrated consistently
Excellent analytical and problem-solving skills, with the ability to thrive in a fast-paced support role
Strong communication skills and the ability to explain complex technical concepts to diverse audiences
A strong track record of developing and executing a strategic roadmap for a technical platform, balancing new features with a dedicated 'book of work' for stability
Demonstrable experience leading resilience initiatives such as wargaming, disaster recovery planning, and incident response simulation
Demonstrated experience in designing and implementing disaster recovery (DR) plans and conducting resilience tests (e.g., wargaming, failure simulation)
A creative and proactive mindset with a demonstrated ability to identify opportunities for process improvement and automation using AI/ML techniques