Responsibilities:
Ensure the highest uptime for customers in our SaaS environment
Provision Customer Tenants & Manage SaaS Platform, Memos to the Staging and Production Environments
Infrastructure Management: Design, deploy, and maintain secure and scalable AWS cloud infrastructure using services like EC2, S3, RDS, Lambda, and CloudFormation
Monitoring & Incident Response: Set up monitoring solutions (e.g., CloudWatch, Grafana) to detect, respond to, and resolve issues quickly, ensuring uptime and reliability
Cost Optimization: Continuously monitor cloud usage and implement cost-saving strategies such as Reserved Instances, Spot Instances, and resource rightsizing
Backup & Recovery: Implement robust backup and disaster recovery solutions using AWS tools like AWS Backup, S3, and RDS snapshots
Security Compliance: Apply security best practices, including IAM policies, security groups, and encryption, while adhering to organizational compliance standards
Infrastructure as Code (IaC): Use Terraform, CloudFormation, or AWS CDK to provision, update, and manage infrastructure in a consistent and repeatable manner
Automation & Configuration Management: Automate manual processes and system configurations using Ansible, Python, or shell scripting
Containerization & Orchestration: Manage containerized applications using Docker and Kubernetes (EKS) for scalable, efficient deployment
Requirements:
2-5 years of experience in Cloud Operations, Infrastructure Management, or DevOps Engineering
Deep expertise in AWS services (EC2, S3, RDS, VPC, Lambda, IAM, CloudFormation, etc.)
Strong experience with Terraform for infrastructure provisioning and automation
Proficiency in scripting with Python, Bash, or PowerShell for cloud automation
Hands-on experience with monitoring and logging tools (AWS CloudWatch, Prometheus, Datadog, ELK Stack, etc.)
Strong understanding of networking concepts, security best practices, IAM policies, and role-based access control (RBAC)
Experience troubleshooting SaaS application performance, system reliability, and cloud-based service disruptions
Familiarity with containerization technologies (Docker, Kubernetes, AWS ECS, or EKS)
Willingness to work in a 24/7 operational environment with rotational shifts