As a Senior Site Reliability Engineer at Optimizely, you will play a critical role in ensuring the reliability, performance, and scalability of our digital platforms. You will collaborate with cross-functional teams to design, implement, and maintain robust systems and processes that enhance the overall user experience.
Job Responsibilities:
Design and implement reliable and scalable systems to support our digital platforms
Collaborate with software engineers to integrate reliability into the architecture
Develop and maintain monitoring solutions to ensure system performance and availability
Identify and resolve performance bottlenecks and optimize system performance
Lead incident response efforts, including troubleshooting, root cause analysis, and implementing corrective actions to prevent future incidents
Develop and maintain automation tools and scripts to improve system efficiency and reduce manual intervention
Implement infrastructure as code practices
Work closely with cross-functional teams to align on reliability goals and best practices
Communicate effectively with stakeholders to provide updates on system status and improvements
Stay up to date with the latest industry trends and technologies related to site reliability engineering
Proactively identify opportunities for process and system improvements
Requirements:
Proven experience as a Senior Site Reliability Engineer or in a similar role in a fast-paced environment
Strong understanding of cloud computing, networking, and system architecture, preferably on AWS; GCP is a plus
Proficiency in scripting and automation tools (e.g., Python, Bash, Terraform, Chef)
Experience with observability tools (e.g., Datadog, Prometheus, Grafana, ELK Stack)
Kubernetes Expertise: Demonstrated experience in designing, deploying, and managing applications in Kubernetes environments. Proficiency in configuring and optimizing Kubernetes clusters for scalability, reliability, and performance. Hands-on experience with Kubernetes tools and technologies such as Helm, Kustomize, and kubectl
Istio Proficiency (preferred): Familiarity with the Istio service mesh architecture and its components
Experience (preferred) with a message broker, preferably Kafka
Understanding (preferred) of coordination services such as ZooKeeper
Proficiency in version control software, particularly Git, is required
Excellent problem-solving skills and attention to detail
Strong communication and collaboration skills
Proficiency in English is required
Nice to have:
Experience with GCP
Familiarity with the Istio service mesh architecture and its components
Experience with a message broker, preferably Kafka
Understanding of coordination services such as ZooKeeper