Site Reliability Engineer

Site Reliability Engineer – Trading

My client are seeking an SRE / Linux Platform Engineer to work on their low late...

Location

United States , New York

Salary:

100000.00 - 200000.00 GBP / Year

Hunter Bond

Expiration Date

Until further notice

Requirements

Strong Linux experience
Experience with Chef, Puppet or Ansible
Experience with Kubernetes, Docker or Podman
Strong CI/CD knowledge
Python scripting experience

Job Responsibility

Platform Engineering: Building, designing and supporting automated solutions for scalable deployment across private and public cloud infrastructure
Low Latency Environment: Supporting a low latency Linux estate within a trading environment
Automation: Working across both project and BAU activities, automating wherever possible
Linux Engineering: Managing and supporting Linux systems
Configuration Management: Working with Chef, Puppet or Ansible
Containers: Supporting Kubernetes, Docker or Podman environments
CI/CD: Building and maintaining CI/CD pipelines
Automation: Writing and maintaining Python scripts

What we offer

Bonus

Fulltime

Site Reliability Engineer – Trading

My client are seeking an SRE / Linux Platform Engineer to work on their low late...

Location

Singapore , Singapore

Salary:

120000.00 - 250000.00 SGD / Year

Hunter Bond

Expiration Date

Until further notice

Requirements

Strong Linux experience
Experience with Chef, Puppet or Ansible
Experience with Kubernetes, Docker or Podman
Strong CI/CD knowledge
Python scripting experience

Job Responsibility

Platform Engineering: Building, designing and supporting automated solutions for scalable deployment across private and public cloud infrastructure
Low Latency: Supporting and optimising a low latency Linux environment
Project & BAU: Working across both project work and BAU activities, automating wherever possible
Linux Engineering: Managing and supporting Linux systems
Configuration Management: Working with Chef, Puppet or Ansible
Containers: Supporting Kubernetes, Docker or Podman environments
CI/CD: Building and maintaining CI/CD pipelines
Automation: Writing and maintaining Python scripts

Fulltime

Trade Floor Site Reliability Engineer

Join us at Barclays as a Trade Floor Site Reliability Engineer, providing real‑t...

Location

United Kingdom , London

Salary:

Not provided

Barclays

Expiration Date

Until further notice

Requirements

Experience in systems engineering, including Linux and Windows, networking, Kubernetes and cloud infrastructure
Proficiency in automation tools
Proficiency in implementing monitoring, alerting and observability for critical trading platforms
The ability to manage incidents effectively, troubleshoot issues swiftly, proactively communicate and perform root cause analysis to prevent future incidents
Prior experience supporting Credit or other IB asset classes such as Rates, Equities or FX
Experience working with PaaS products, including some experience of virtualization, containerization, or orchestration of compute/network/storage

Job Responsibility

Providing real‑time support to Credit EMEA traders and sales teams to keep critical trading platforms stable and performant
Ensuring seamless client service as electronic and algo trading rapidly expand
Provision of technical support for the service management function to resolve more complex issues
Execution of preventative maintenance tasks on hardware and software and utilisation of monitoring tools/metrics
Maintenance of a knowledge base containing detailed documentation
Analysis of system logs, error messages and user reports to identify root causes
Automation, monitoring enhancements, capacity management, resiliency, business continuity management, front office specific support and stakeholder management
Identification and remediation of potential service impacting risks and issues
Proactively assess support activities, implementing automations where appropriate

What we offer

Competitive holiday allowance
Life assurance
Private medical care
Pension contribution

Fulltime

Mercor is at the intersection of labor markets and AI research. We partner with ...

Location

United States , San Francisco

Salary:

130000.00 USD / Year

Mercor

Expiration Date

Until further notice

Requirements

Experience doing true SRE work (not just operations) across multiple roles or companies
Deep familiarity with SRE practices as popularized by Google (e.g., error budgets, reliability vs. risk trade-offs, large-scale distributed systems)
5+ years of SRE experience
15+ years of overall experience is ideal for this first SRE hire
Proven success operating systems at scale, with a strong understanding of the challenges of large, distributed production environments
Strong collaboration skills; able to work efficiently with cross-functional engineering teams
Ability to drive cultural change around reliability while remaining hands-on in building and fixing systems
Comfort working in high-intensity, high-availability environments where uptime and production quality are critical

Job Responsibility

Own reliability and production safety for core shared services and customer-facing systems
Partner directly with infrastructure leadership to define SRE priorities, reliability standards, and production safety roadmap
Repair and improve how our production systems are structured so they are stable, resource-efficient, isolated, and well-observed
Introduce and champion modern SRE practices (e.g., incident response, postmortems, SLIs/SLOs) across engineering teams
Collaborate with leverage engineering and applied AI teams to ensure sustainable growth
Represent SRE best practices internally and help teams onboard onto production in a way that is safe, scalable, and consistent with SRE principles

What we offer

Offers Equity

Fulltime

Site Reliability Engineer - RGM

We are looking for a Site Reliability Engineer for RGM to join our dynamic team....

Location

Egypt

Salary:

Not provided

Coca-Cola HBC

Expiration Date

Until further notice

Requirements

Bachelor’s or Master’s degree in Computer Science, Information Technology, Engineering, or a related field
5+ years of experience in Site Reliability Engineering, DevOps, or IT operations with a focus on application reliability and observability
Hands-on experience with complex technology stacks: SAP S/4 (ideally with SAP BTP, Pricing) and/or Trade Promotion Management platforms (Xtel TPM preferred), system interfaces/APIs maintenance, EDI
Understanding of and experience with promo and revenue growth management processes, preferably in the FMCG industry
Strong problem-solving skills and the ability to work under pressure during incidents
Excellent communication and collaboration skills, with the ability to coordinate across cross-functional teams
Fluent in English, with strong written and verbal communication skills

Job Responsibility

Proactive Incident Analysis & Operational Improvements
Complex Troubleshooting & Problem Management
Automation for Efficiency
Observability and Monitoring
Cross-functional Collaboration

What we offer

Coaching and mentoring programs
Development opportunities
Equal opportunity employer
IT Equipment
Work with iconic brands
Supportive team
Volunteering Opportunities
Work from home

Fulltime

Senior Site Reliability Engineer

We are seeking a Senior Site Reliability Engineer to drive the reliability, scal...

Location

United Kingdom , London

Salary:

Not provided

Outsource UK

Expiration Date

Until further notice

Requirements

Bachelor's degree in Computer Science, Engineering, or equivalent experience
7+ years in site reliability, production engineering, or systems engineering roles
Strong understanding of distributed systems, consistency models, failure modes, and fault isolation strategies
Hands-on experience with AWS, GCP, or Azure, including multi-region deployments
Proficiency in Kubernetes and large-scale container orchestration
Programming experience in Go, Python, or Java, building automation or reliability systems
Experience designing and operating CI/CD pipelines with deployment safety guardrails
Proven track record leading high-severity incidents and driving systemic remediation
Excellent interpersonal skills with experience influencing cross-team decisions

Job Responsibility

Identify systemic reliability risks and implement preventative solutions
Define and maintain SLIs, SLOs, and error budgets aligned with business and user outcomes
Lead incident management, post-incident reviews, and remediation planning
Review and advise on system architecture to improve scalability, availability, and fault isolation
Design strategies for high availability, graceful degradation, and disaster recovery across multi-region environments
Quantify trade-offs between performance, cost, and operational risk
Enhance deployment pipelines and implement automation to reduce risk and accelerate delivery
Apply safe deployment patterns such as canary, blue/green, and progressive delivery
Ensure robust rollback and recovery mechanisms
Build and evolve monitoring, logging, and tracing solutions to provide actionable insights

Site Reliability Engineer

Barclays Services Corp. seeks Site Reliability Engineer (SRE), AVP in Whippany, ...

Location

United States , Whippany

Salary:

150550.00 - 160000.00 USD / Year

Barclays

Expiration Date

Until further notice

Requirements

Leverage expertise in markets to maintain and improve the reliability and scalability of electronic trading systems including Connectivity Gateways, Trading Algorithms, and routing engines
Develop and extend internal tools leveraging high level programming languages such as Python in a multi-tiered Linux based environment
Monitor latency, throughput, and system health leveraging industry standard tools such as ITRS, Corvil (or equivalent), and Elastic
Perform daily release management across the global e-trading stack, providing change management oversight along with release implementation and start-of-day availability
Collaborate with cross-functional teams including the front office (traders), quantitative developers, technology teams, and operations to enable stable and resilient solutions across our businesses
Conduct detailed postmortems and lessons-learned reviews to drive stability and reduce overall Mean Time To Recover (MTTR)
Support Exchange mandatory upgrades and other market events including heightened awareness support during periods of market volatility
Maintain rigorous and concise documentation for operational runbooks and systems support

Job Responsibility

Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning
Resolution, analysis and response to system outages and disruptions, and implementation of measures to prevent similar incidents from recurring
Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience
Monitoring and optimisation of system performance and resource usage, identification and resolution of bottlenecks, and implementation of best practices for performance tuning
Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and close work with other teams to ensure smooth and efficient operations
Stay informed of industry technology trends and innovations, and actively contribute to the organization's technology communities to foster a culture of technical excellence and growth

What we offer

Medical, dental and vision coverage
401(k)
Life insurance
Other paid leave for qualifying circumstances
Incentive award
Competitive holiday allowance
Private medical care
Pension contribution

Fulltime

Site Reliability Engineer

Join the APAC Equity Derivatives Flow Vol, Automated Market Making (AMM) Techno...

Location

China , Hong Kong

Salary:

Not provided

Barclays

Expiration Date

Until further notice

Requirements

University Graduate or above in Computer Science or related field
Several years of strong experience in a Production support / Trade floor support role
Minimum 2-3 years Equity Derivatives technology experience in a related front-office application support role within Investment banking
Experience in one or more programming languages such as Python or Java, scripting in Python/Shell/Perl, and unit testing practices, with experience working in a Linux environment
Experience in Windows and UNIX platforms, Oracle PL/SQL, Autosys
Understanding of DevOps concepts and use cases, agile software development methodology
Understanding of higher-level Computer Science concepts such as data structures and algorithms, expertise in computer networks, computer architecture, and operating systems
ITIL knowledge
Equity Derivative product knowledge
Experience supporting risk management systems

Job Responsibility

Availability, performance, and scalability of systems and services through proactive monitoring, maintenance, and capacity planning
Resolution, analysis and response to system outages and disruptions, and implementation of measures to prevent similar incidents from recurring
Development of tools and scripts to automate operational processes, reducing manual workload, increasing efficiency, and improving system resilience
Monitoring and optimisation of system performance and resource usage, identification and resolution of bottlenecks, and implementation of best practices for performance tuning
Collaboration with development teams to integrate best practices for reliability, scalability, and performance into the software development lifecycle, and close work with other teams to ensure smooth and efficient operations
Stay informed of industry technology trends and innovations, and actively contribute to the organization's technology communities to foster a culture of technical excellence and growth

What we offer

Competitive holiday allowance
Life assurance
Private medical care
Pension contribution

Fulltime
