As a Site Reliability Engineer, you will be part of an empowered, fully remote team that plays a key part in building, automating and enhancing our cloud and container-based infrastructure. We act as 'engineers for the engineers', helping others understand and leverage the architecture and platform underlying their features. Our technology stack consists of AWS, Kubernetes and Istio. Our aim is to create a reliable platform for running our core services while enabling teams to move fast, take risks and experiment.
Job Responsibilities:
Build and scale our cloud-based infrastructure, including managing our Kubernetes clusters and AWS environment
Ensure the high availability, autoscaling and failure recovery capabilities of production and pre-production systems
Develop custom controllers to automate the management of clusters
Leverage Istio and Envoy to manage service communication and provide network observability
Actively drive initiatives towards better system design and implementation of new technologies
Participate in infrastructure on-call rotations
Champion our operations culture and help the engineering organization deliver highly available services for our customers
Requirements:
Availability from 13:00 to 17:00 in the Central European time zone (Berlin/Zurich) every day for collaboration with the team
Experience with Kubernetes and running containers at scale
A good low-level understanding of the Linux operating system
Strong coding skills in at least one programming language. Our most used language is Go
Good understanding of distributed systems, networking and container technology
Sufficient grasp of public cloud environments like AWS
Positive, proactive team player who is passionate about their craft and cares about helping the team deliver
You care about monitoring and understanding the state of systems
Problem solver with operations skills who can quickly diagnose and pinpoint issues in a production environment
Excellent written and verbal communication skills in English
Nice to have:
Took part in company-wide initiatives to improve operational excellence
Extended or contributed to open-source components (mainly Kubernetes and Istio, or similar tools in the compute and networking domains)
What we offer:
Annual personal growth budget and mentorship programs for continuous learning and development
Work from anywhere in the world for 40 days per year
Flexible working arrangements to support work-life balance
Opportunities to collaborate and socialize with team members through quarterly team events and yearly company-wide events
Monthly transportation and fitness budget
Discounts for you, your friends, and family on GetYourGuide activities
Language reimbursement program
Health and wellness benefits
Monthly allowance for transport (Deutschland ticket)
Bonuses for successful employee referrals
Company contributions to personal pension plans
30 days per year for telecommuting
20% discount for friends & family on GetYourGuide activities