Our client, a leading global financial institution, is seeking an experienced Technology Service Manager to oversee major incident management, service restoration, stakeholder communications, and operational resilience across critical banking platforms. The successful candidate will play a key role in ensuring high availability of customer-facing services while driving continuous improvements in incident response and operational excellence.
Job Responsibilities
Manage and coordinate major technology incidents impacting customers and business operations
Assess business impact and determine incident priority based on established severity frameworks
Mobilize technical support teams and facilitate incident bridges to drive rapid service restoration
Ensure timely escalation and execution of incident management processes
Coordinate problem management activities and support Root Cause Analysis (RCA) reviews
Act as the primary liaison between technology teams and business stakeholders during incidents
Provide clear and timely updates to senior management, business leaders, risk teams, and operational stakeholders
Facilitate real-time communication channels and business bridges during major incidents
Maintain incident dashboards and reporting mechanisms for ongoing service disruptions
Ensure incidents, outage details, business impacts, and recovery actions are accurately documented
Support trend analysis, service reporting, and continuous improvement initiatives
Capture lessons learned and ensure preventive measures are implemented to reduce recurrence
Maintain knowledge repositories, operational documentation, and recovery procedures
Support initiatives focused on customer experience monitoring and service reliability
Drive automation and workflow improvements within incident management processes
Participate in operational resilience, business continuity, and failure analysis programs
Contribute to innovation initiatives that improve incident response and reduce service impact
Identify operational risks and control weaknesses, ensuring appropriate remediation and escalation
Participate in governance forums and provide operational metrics and compliance reporting
Support adherence to incident, problem, and change management processes
Requirements
Minimum 5 years of experience in Production Support, Site Reliability Engineering (SRE), IT Operations, or Technology Service Management within the banking or financial services sector
Strong experience in Incident Management, Problem Management, and Change Management
Solid understanding of Linux/Unix systems, networking fundamentals, and application architectures
Hands-on experience with monitoring and observability platforms such as Grafana, ELK, AppDynamics, ITRS, BMC, or equivalent tools
Excellent communication and stakeholder management skills
Willingness to work rotational shifts, including day and night shifts
Nice to have
Experience with cloud technologies, Kubernetes, OpenShift, and containerized environments
Scripting or automation experience using Python, Shell, or similar languages
Experience working in highly regulated industries
Exposure to ServiceNow, CI/CD pipelines, and modern SRE practices