Site Reliability Engineer Job at GetYourGuide (Berlin)

Job Description

As a Site Reliability Engineer, you will be part of an empowered, fully remote team that plays a key part in building, automating and enhancing our cloud and container-based infrastructure. We act as 'engineers for the engineers', helping others understand and leverage the architecture and platform underlying their features. Our technology stack consists of AWS, Kubernetes and Istio. Our aim is to create a reliable platform for running our core services while enabling teams to move fast, take risks and experiment.

Job Responsibilities

Build and scale our cloud-based infrastructure, including managing our Kubernetes clusters and AWS environment
Ensure the high availability, autoscaling and failure recovery capabilities of production and pre-production systems
Develop custom controllers to automate the management of clusters
Leverage Istio and Envoy to manage service communication and provide network observability
Actively drive initiatives towards better system design and implementation of new technologies
Participate in infrastructure on-call rotations
Champion our operations culture and help the engineering organization deliver highly available services for our customers

Requirements

Availability from 13:00 to 17:00 Central European Time (Berlin/Zurich) every day for collaboration with the team
Experience with Kubernetes and running containers at scale
A good, low-level understanding of the Linux operating system
Strong coding skills in at least one programming language; our most-used language is Go
Good understanding of distributed systems, networking and container technology
Sufficient grasp of public cloud environments like AWS
Positive, proactive team player who is passionate about their craft and cares about helping the team deliver
You care about monitoring and understanding the state of systems
Problem solver with operations skills who can quickly diagnose and pinpoint issues in a production environment
Excellent written and verbal communication skills in English

Nice to have

Took part in company-wide initiatives to improve operational excellence
Extended or contributed to open-source components (mainly Kubernetes and Istio, or similar tools in the compute and networking domain)

What we offer

Annual personal growth budget and mentorship programs for continuous learning and development
Work from anywhere in the world for 40 days per year
Flexible working arrangements to support work-life balance
Opportunities to collaborate and socialize with team members through quarterly team events and yearly company-wide events
Monthly transportation and fitness budget
Discounts for you, your friends, and family on GetYourGuide activities
Language reimbursement program
Health and wellness benefits
Monthly allowance for transport (Deutschland ticket)
Bonuses for successful employee referrals
Company contributions to personal pension plans
30 days per year for telecommuting
20% discount for friends & family on GetYourGuide activities
