Barclays Services Corp. seeks Site Reliability Engineer (SRE), AVP in Whippany, NJ (multiple positions available). Apply software engineering techniques, automation, and best practices in incident response to ensure the reliability, availability, and scalability of systems, platforms, and technology.
Job Responsibilities:
Ensure the availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning
Respond to, analyse, and resolve system outages and disruptions, and implement measures to prevent similar incidents from recurring
Develop tools and scripts that automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience
Monitor and optimise system performance and resource usage, identify and address bottlenecks, and implement best practices for performance tuning
Collaborate with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and work closely with other teams to ensure smooth and efficient operations
Stay informed of industry technology trends and innovations, and actively contribute to the organization's technology communities to foster a culture of technical excellence and growth
Requirements:
Leverage expertise in markets to maintain and improve the reliability and scalability of electronic trading systems including Connectivity Gateways, Trading Algorithms, and routing engines
Develop and extend internal tools using high-level programming languages such as Python in a multi-tiered, Linux-based environment
Monitor latency, throughput, and system health using industry-standard tools such as ITRS, Corvil (or equivalent), and Elastic
Perform daily release management across the global e-trading stack, providing change management oversight along with release implementation and start-of-day availability
Collaborate with cross-functional teams, including the front office (traders), quantitative developers, technology teams, and operations, to enable stable and resilient solutions across our businesses
Conduct detailed postmortems and lessons-learned reviews to drive stability and reduce overall Mean Time To Recover (MTTR)
Support exchange-mandated upgrades and other market events, including heightened-awareness support during periods of market volatility
Maintain rigorous and concise documentation for operational runbooks and systems support