We’re seeking a DevOps and Site Reliability Engineer with strong expertise in Microsoft Azure to manage our observability platform and AIOps automation. The ideal candidate will have extensive hands-on experience with high-traffic environments and security automation, as well as in-depth knowledge of the Azure platform.
Job Responsibilities:
Observability Platform: Implement and own the Bestway Azure Observability Playbook — building comprehensive dashboards, alert rules, and runbooks using Application Insights, Log Analytics, and KQL
AIOps Automation: Develop intelligent alerting systems that leverage AI/ML to detect early-warning signals — including IP reputation degradation, database saturation trends, and anomalous traffic patterns — before they escalate to incidents
Release Assurance: Define and execute Operational Acceptance Testing (OAT) gates for all Production deployments, ensuring releases meet reliability, performance, and security thresholds before go-live
Infrastructure Hygiene: Conduct periodic audits of the Azure tenant to identify and decommission orphaned or unutilised resources ('Zombie Resources') — directly reducing operational burn rate
IaC & CI/CD: Build and maintain reusable Terraform modules, and manage pipeline integrity across GitHub Actions workflows to ensure consistent, reproducible infrastructure deployments across a multi-subscription Hub-and-Spoke network
Requirements:
6+ years in DevOps or SRE roles
Specific experience managing high-traffic Azure-hosted environments at scale
Mastery of Terraform — including module authoring, remote state management, and workspace strategies for multi-environment deployments
Expert-level KQL (Kusto Query Language) for Log Analytics
Comfortable building custom Azure Monitor Workbooks for operational reporting
Strong security automation experience: passwordless authentication via OIDC, Azure Key Vault integration, and secrets management best practices
In-depth knowledge of Azure Container Apps (ACA), VNet Integration, and Private Endpoint configuration for secure, network-isolated workloads