Senior Site Reliability Engineer Job at Bloomreach (Bratislava)

Job Description

We are looking for a dedicated DevOps Engineer to join our Analytics team and manage our in-memory database (IMF) and related services. Our system runs on Google Cloud Platform (GCP) and Kubernetes and integrates with Kafka, MongoDB, and other services. Your job will be to keep our databases and services running smoothly, maintain reliable monitoring, and develop tools and automation for new releases, maintenance, and incident management.

Job Responsibility

Manage and configure our Kubernetes components to ensure they are highly available, reliable, and performant
Handle incident responses and perform root cause analysis for critical issues
Participate in a 24/7 on-call rotation, with each shift lasting one week
Create and maintain scripts and tools using Python and Go to automate operations and reduce manual tasks
Monitor system performance and plan for future scaling
Ensure there are enough resources during peak times
Set up and maintain systems to monitor and log activities, so issues can be detected and addressed early
Ensure our database has reliable backups and efficient tools for quick and smooth recovery
Work closely with other engineers and product managers to ensure successful project delivery
Collaborate with L2 support engineers to ensure seamless operations and effective problem resolution

Requirements

Have worked in DevOps or Site Reliability Engineering (SRE) before
Understand basic DevOps principles
Familiar with cloud platforms, especially Google Cloud Platform (GCP)
Know how to use Kubernetes (important)
Know how to build and maintain CI/CD pipelines in GitLab or similar
Good at automating tasks and scripting with Python, Go, or Shell (for basic Linux tasks and Kubernetes management)
Experienced in handling and resolving incidents
Know how to use monitoring tools such as VictoriaMetrics and Grafana
Familiar with logging tools
Good at analyzing issues and finding solutions
Can communicate well and work well with remote teams
Able to work on your own and manage multiple tasks
Comfortable working in a fast-paced environment

What we offer

A great deal of freedom and trust
We believe in flexible working hours to accommodate your working style
We work virtual-first with several Bloomreach Hubs available across three continents
We organize company events
We encourage and support our employees to engage in volunteering activities - every Bloomreacher can take 5 paid days off to volunteer
People Development Program with personal development workshops
Our resident communication coach, Ivo Večeřa, is available
Leader Development Program
Every Bloomreacher has an annual professional education budget of $1,500
The Employee Assistance Program, with counselors, is available for non-work-related challenges
Subscription to Calm - sleep and meditation app
We organize ‘DisConnect’ days where Bloomreachers globally enjoy one additional day off each quarter
We facilitate sports, yoga, and meditation opportunities for each other
Extended parental leave up to 26 calendar weeks for Primary Caregivers
Restricted Stock Units or Stock Options are granted
Everyone gets to participate in the company's success through the company performance bonus
We offer an employee referral bonus of up to $3,000
We reward & celebrate work anniversaries -- Bloomversaries
