LearnUpon is looking for a Staff Site Reliability Engineer to join our team in Ireland. This is a flex role, working 1 day per week from LearnUpon's Dublin office. LearnUpon LMS helps organizations train their employees, partners, and customers. Businesses can manage, track, and achieve their unique learning goals — all through a single, powerful solution. As a Staff Engineer in Site Reliability Engineering you will be part of the team responsible for the scale-out of the LearnUpon infrastructure.
Job Responsibilities:
Identifying opportunities to improve and scale our infrastructure for performance, observability, maintainability, and cost, by creating innovative solutions
Leading our efforts to build an observability function that incorporates application metrics, application transaction tracking, and event log management
Driving the processes to maintain resilient, scalable and cost-effective infrastructure
Working with other Engineering teams to provide infrastructure solutions that meet their ongoing requirements
Building tools focused on measuring, monitoring and alerting, with an eye towards self-service in order to promote Engineers’ ownership of observability
Reacting quickly to changing customer and business needs
Participating in the on-call rota
Mentoring junior talent
Requirements:
7+ years of experience in a software or Ops role
5+ years of cloud engineering experience, including at least 2 years with AWS
Experience deploying microservice environments using containerisation technologies such as Kubernetes and Docker
Experience in designing and implementing Observability tech stacks
A track record of championing the benefits of Observability to Engineering teams
Ability to architect an SLO/SLI implementation that balances the needs of different teams
Familiarity with cost analysis of Observability metrics gathering, Engineering effort, and tooling
Experience building and supporting large-scale distributed systems that back a consumer app or website with associated requirements of performance, security and disaster recovery