Senior Cloud SRE Job at Hazelcast

Job Description

We are looking for an SRE experienced in distributed systems, Kubernetes, and microservices to join our Applications team. The team focuses on providing tooling to enrich the core Hazelcast Platform, making it easier to use and scale, adding functionality, and ensuring that solutions meet the most demanding customer needs. Day to day, you’ll be leveraging your solid engineering fundamentals with a focus on performance, consistency, resilience and scale, bringing your passion for solving difficult problems to help realize the product vision. Your role as an SRE is crucial in ensuring that Hazelcast Platform meets business objectives, is robust and scalable, and can be depended upon by customers for mission-critical implementations.

Job Responsibilities

Keep Hazelcast cloud-based production systems running smoothly 24/7/365
Design, develop, and maintain our cloud infrastructure to support both our end-user management center and microservice-based platform
Implement new solutions using AWS and Terraform, improving scalability, throughput, and reliability
Support and manage our Keycloak IdP, ensuring it provides appropriate security while meeting the needs of the development team
Implement security measures to protect data integrity and confidentiality, including encryption, access control, and compliance with relevant regulations
Work with our operations team to maintain our SOC2 & ISO27001 compliance and keep our environment secure
Monitor the system for performance issues, errors, and potential failures, and implement maintenance procedures such as backups, data recovery, and disaster recovery plans
Troubleshoot issues related to data storage, including performance bottlenecks, data corruption, or compatibility issues with other software components
Collaborate with cross-functional teams, including software developers, architects, and product managers, to ensure the effective integration and operation of the components within the overall software infrastructure
Document design decisions, implementation details, and operational procedures to facilitate collaboration among team members and ensure the maintainability of the system
Stay up to date with the latest developments in storage technologies, the Java programming language, and software engineering best practices, and apply this knowledge to improve existing storage systems and develop new solutions
On-call participation: be part of our on-call rotation to respond to availability incidents and work with support and engineers on customer incidents

Requirements

Experience with distributed systems, Kubernetes, and microservices
Infrastructure as Code (Terraform)
Modern DevOps stack (K8s, Prometheus, Grafana, OpenTelemetry, ArgoCD, Helm)
Experience with at least one programming language, preferably Golang or Python
Experience with CI and with building CD pipelines (Jenkins, GitHub Actions)
A passion for automation and keeping our software delivery fast and efficient
Bachelor's degree in a relevant field of study (Computer Science, or related discipline) OR equivalent experience

Nice to have

Multi-cloud (AWS, GCP and/or Azure)
Experience working with software engineers in designing cloud-native applications or troubleshooting them
Experience as part of an on-call rota

What we offer

25 days annual leave + bank holidays
Group Company Pension Plan
Private Medical Insurance
Private Dental Insurance
Life Insurance
EAP (Employee Assistance Program)
