We are looking for an SRE experienced in distributed systems, Kubernetes, and microservices to join our Applications team. The team builds tooling that enriches the core Hazelcast Platform, making it easier to use and scale and adding functionality, so that our solutions meet the most demanding customer needs. Day to day, you’ll apply solid engineering fundamentals with a focus on performance, consistency, resilience, and scale, bringing your passion for solving difficult problems to help realize the product vision. Your role as an SRE is crucial in ensuring that Hazelcast Platform meets business objectives, is robust and scalable, and can be depended upon by customers for mission-critical implementations.
Job Responsibility:
Keep Hazelcast cloud-based production systems running smoothly 24/7/365
Design, develop, and maintain our cloud infrastructure to support both our end-user management center and our microservice-based platform
Implement new solutions using AWS and Terraform, improving scalability, throughput, and reliability
Support and manage our Keycloak IdP, ensuring it provides appropriate security while meeting the needs of the development team
Implement security measures to protect data integrity and confidentiality, including encryption, access control, and compliance with relevant regulations
Work with our operations team to maintain our SOC 2 and ISO 27001 compliance and keep our environment secure
Monitor the system for performance issues, errors, and potential failures, and implement maintenance procedures such as backups, data recovery, and disaster recovery plans
Troubleshoot issues related to data storage, including performance bottlenecks, data corruption, or compatibility issues with other software components
Collaborate with cross-functional teams, including software developers, architects, and product managers, to ensure the effective integration and operation of the components within the overall software infrastructure
Document design decisions, implementation details, and operational procedures to facilitate collaboration among team members and ensure the maintainability of the system
Stay up to date with the latest developments in storage technologies, the Java programming language, and software engineering best practices, and apply this knowledge to improve existing storage systems and develop new solutions
On-call participation
Be part of our on-call rotation to respond to availability incidents and work with support and engineers on customer incidents
Requirements:
Experience with distributed systems, Kubernetes, and microservices
Infrastructure as Code (Terraform)
Modern DevOps stack (K8s, Prometheus, Grafana, OpenTelemetry, ArgoCD, Helm)
Experience with at least one programming language, preferably Golang or Python
Experience with CI and building CD pipelines (Jenkins, GitHub Actions)
A passion for automation and keeping our software delivery fast and efficient
Bachelor's degree in a relevant field of study (Computer Science, or related discipline) OR equivalent experience
Nice to have:
Multi-cloud (AWS, GCP and/or Azure)
Experience working with software engineers in designing cloud-native applications or troubleshooting them