Join Barclays as a mid‑level Site Reliability Engineer and help us improve the reliability, performance, and visibility of business‑critical pre‑trade platforms in the Quantitative Investment Strategies Technology team in Prague. You will build the observability stack (with a focus on the Elastic ecosystem), implement actionable monitoring and alerting, and work closely with engineers, product owners, and business stakeholders to keep our services resilient and measurable.
Job Responsibilities:
Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning
Resolution, analysis and response to system outages and disruptions, and implementation of measures to prevent similar incidents from recurring
Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience
Monitoring and optimisation of system performance and resource usage, identification and resolution of bottlenecks, and implementation of best practices for performance tuning
Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and close cooperation with other teams to ensure smooth and efficient operations
Staying informed of industry technology trends and innovations, and actively contributing to the organization's technology communities to foster a culture of technical excellence and growth
Requirements:
Hands‑on experience with Elastic Stack (Elasticsearch, Kibana, Logstash/Beats)
Strong understanding of observability & monitoring (metrics, logs, traces, APM)
Experience defining and configuring dashboards, alerts, and SLIs/SLOs