Senior Site Reliability Engineer

Miniclip

Location:
Lisbon, Portugal

Category:
IT - Software Development

Contract Type:
Not provided

Salary:
Not provided

Job Description:

What will you be doing at Miniclip?

Job Responsibilities:

  • Participate in an on-call rotation with the Cloud Engineering team to respond to production incidents and outages
  • Operate and evolve infrastructure using Infrastructure as Code (Terraform), configuration management tools, and containerized platforms on AWS
  • Build and maintain observability tooling to detect symptoms before they lead to outages
  • Automate repetitive tasks and processes to reduce operational toil
  • Collaborate with Engineering and Product teams to design resilient systems that meet performance and reliability goals
  • Troubleshoot production issues across application, network, and infrastructure layers
  • Document systems, processes, and runbooks to improve team transparency and onboarding

Requirements:

  • 5+ years of hands-on experience with AWS in both development and operations contexts
  • Strong Linux system administration skills, including performance tuning and debugging
  • Software development background and strong coding skills in one or more of the following: Go, Python, Ruby
  • Experience with Infrastructure as Code, particularly Terraform
  • Familiarity with CI/CD pipelines and artifact management tools
  • A mindset for resilient systems design, thinking about edge cases, failure modes, and graceful degradation
  • Excellent communication skills in English, both written and spoken
  • Comfortable in a fast-paced environment and adaptable to shifting priorities

Nice to have:

  • Experience with EKS or ECS
  • Familiarity with chaos engineering practices
  • Knowledge of OpenTelemetry or distributed tracing systems
  • Knowledge of Service Level Objectives (SLOs) and Service Level Indicators (SLIs)
  • Experience setting up error budgets and conducting post-incident reviews

Additional Information:

Job Posted:
December 10, 2025
