Site Reliability Engineer Job at NTT DATA (Guadalajara)

Job Description

We are currently seeking a Site Reliability Engineer (SRE) to join our team in Guadalajara, Jalisco (MX-JAL), Mexico (MX).

Job Responsibility

Perform L1.5 activities such as monitoring, deployment, rollback
Monitor the efficiency of the Azure cloud systems to prevent outages and initiate an Incident Management bridge in case of an outage
Troubleshoot Azure resources, escalate to Level 3 (Software Development Team)

Requirements

Understand the Microsoft Azure Cloud - ideally Azure Fundamentals certified OR Computer Science/Information Systems Management degree
Familiar with PaaS and IaaS - VMs, Storage, Event Hubs, Service Fabric Cluster (SFC), Azure Kubernetes Service (AKS), Cosmos DB, SQL Server, IoT Hub, Databricks, Key Vault, Data Lake
Understand the concept of Internet of Things (IoT) - telemetry, ingestion, processing, data storage, reporting
Understand the concept of CI/CD and automation tools - Octopus, Bamboo, Terraform, Azure DevOps, Jenkins, GitHub, Ansible
Understand the concept of container orchestration platforms (e.g. Kubernetes)
Understand scripting concepts: PowerShell, Python
Understand the difference between NoSQL and SQL databases, and how to maintain them
Understand monitoring and logging systems (Log Analytics, Splunk, ELK, Prometheus, Nagios, Zabbix, etc.)
Independent thinker - why does it break, what can I proactively do to fix it
Please note this is a 24/7 IT operations support team, and it is often necessary to rotate shifts. The rotation can occur every one or two months, so please do not assume you will only work the standard Monday through Friday day shift
