Join our Data & AI Platform team as a Site Reliability Engineer (SRE) – Platform Operation. You will support and maintain scalable, resilient, and efficient infrastructure for our Data & AI Platform, ensuring reliable infrastructure availability and improving business-as-usual operations. You will collaborate closely with Platform Engineers, Architects, Data Engineers, DevOps, and Security teams to maintain and optimize our platforms.
Job Responsibilities:
Support, manage, and maintain Azure resources: Azure SQL, Synapse, Data Factory, Databricks, Unity Catalog
Monitor Azure workloads, troubleshoot incidents, alerts, and performance bottlenecks
Implement and manage RBAC, identity & access policies, and compliance controls
Optimize Azure cost and performance using Azure Monitor, Datadog, and Cost Management tools
Automate tasks using PowerShell, Azure CLI, Terraform, and Python
Utilize Git, GitHub Actions, and Airflow for workflow automation
Provide L2/L3 support for data pipelines, reporting, and cloud services
Conduct incident response, root cause analysis (RCA), and proactive issue resolution
Collaborate with Cloud Engineering, Data Engineers, BI Developers, and Cloud Architects
Follow ITSM processes: Incident, Change, and Problem Management
Ensure platform security and compliance with frameworks like MICS
Requirements:
Academic background: Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field (minimum 3 years of experience)
Experience: 5+ years hands-on with cloud platforms (Azure, AWS, GCP), programming (Bash, PowerShell, Terraform, Python, Java), and Infrastructure as Code (IaC)
English language: Professional working proficiency in English and the local language
Tools / software: Deep expertise in Azure, Databricks, Unity Catalog, Kubernetes, Helm, Docker, Power BI, Datadog, Grafana, GitHub, Azure DevOps, ArgoCD, Airflow, SSIS, Power Query, and relational/NoSQL databases
AI experience: Experience supporting enterprise Data & AI platforms
Soft skills: Analytical problem-solving; effective communication and active listening; team player with respect for others; strong troubleshooting and platform monitoring skills