As a Site Reliability Engineer, you will play a key role in ensuring our systems remain reliable, available, and performant for both our customers and internal teams. Your expertise will directly impact our users' experience and the success of our business. In this role, you'll collaborate closely with our product development and platform engineering teams to build scalable systems and create robust automation that supports our company's goals. Your day-to-day work will make a meaningful difference in how efficiently and effectively our technology operates.

We're looking for someone who has hands-on experience with technologies like AWS, CDN, Terraform, Packer, and Splunk. Keen troubleshooting abilities will be essential as you identify and solve complex issues in the critical applications our customers rely on daily.

The ideal candidate thrives on learning new technologies and approaches challenges with enthusiasm. You'll be joining a collaborative environment where your problem-solving skills will shine as you work across multiple teams. If you're self-motivated, passionate about quality, and ready to make an impact, we want to hear from you!
Job Responsibilities:
Collaborate with development teams to implement and deploy new features that meet high standards for reliability, security, and performance
Partner with cross-functional teams to establish and enhance enterprise standards and best practices
Develop and maintain effective monitoring tools, alerts, and dashboards that provide clear visibility into system health and performance
Analyze metrics and logs to proactively detect anomalies, optimize performance, plan capacity, and isolate issues before customer impact occurs
Identify innovative solutions to complex problems and implement corrective actions decisively
Mentor junior team members while documenting and sharing solutions to build team knowledge
Requirements:
Minimum of 5 years' experience in DevOps engineering roles such as SRE, DevOps, or CloudOps
Advanced proficiency with Terraform for infrastructure as code implementation (required)
Extensive experience with AWS technologies and services, including EC2, S3, RDS, and IAM (required)
Comprehensive understanding of HTTP protocols, web server technologies, and troubleshooting
Strong experience with load balancing solutions such as AWS ELB, NGINX, or HAProxy
Practical knowledge of caching technologies and CDN implementations
Working experience with Redis for in-memory data storage and caching
Expertise in monitoring and troubleshooting web application performance and availability
Practical experience with observability solutions such as Splunk, Datadog, or similar
Proficiency in one or more languages such as Java, Go, Python, or Linux shell scripting
Proven experience operating effectively in an agile software development environment
Strong understanding of AWS pricing/cost models across compute, storage, and database offerings
Experience implementing and maintaining CI/CD pipelines
Ability to multitask and adapt to changing priorities in a fast-paced, 24x7 environment
Collaborative approach to working with cross-functional teams of both technical and business professionals
Excellent communication, problem-solving, and customer service skills
Bachelor's degree in computer science, science, or engineering, or equivalent technical certifications (preferred)
Nice to have:
Demonstrated ability implementing and optimizing CDN solutions for global content delivery