Our financial client in Toronto is seeking a hands-on SRE Observability SME to provide day-one expertise in improving system reliability, performance, and incident response across complex distributed environments. This is a HYBRID, embedded role working closely with engineering teams to drive observability best practices.
Job Responsibilities:
Provide hands-on SRE and observability expertise across applications and infrastructure
Implement and optimize monitoring, alerting, and observability frameworks
Troubleshoot complex performance and reliability issues using metrics, events, logs, and traces (MELT)
Design and build advanced dashboards and visualization solutions
Guide teams on SRE best practices and reliability improvements
Support incident response, root cause analysis, and remediation
Develop creative observability solutions for systems with limited visibility