Working on a team of professionals, you will manage, configure, and support observability tooling for Teradata’s product offerings across all three major cloud service providers (AWS, Azure, and Google Cloud). You will define, configure, and deploy monitoring to measure performance, scalability, reliability, and resiliency, and alert when critical thresholds are crossed. You will build concise, impactful dashboards displaying infrastructure-level and application-level telemetry for both internal and external audiences. You will monitor all the layers of Teradata’s application stack, from the customer-facing interface all the way through the backend, including all services, network layers, databases, and cloud service provider integrations. You will constantly seek to reduce mean-time-to-discover and mean-time-to-recover through improvements to Teradata’s observability tooling.
Job Responsibilities:
Manage, configure, and support observability tooling for Teradata’s product offerings across all three major cloud service providers (AWS, Azure, and Google Cloud)
Define, configure, and deploy monitoring to measure performance, scalability, reliability, and resiliency, and alert when critical thresholds are crossed
Build concise, impactful dashboards displaying infrastructure-level and application-level telemetry for both internal and external audiences
Monitor all the layers of Teradata’s application stack, from the customer-facing interface all the way through the backend, including all services, network layers, databases, and cloud service provider integrations
Constantly seek to reduce mean-time-to-discover and mean-time-to-recover through improvements to Teradata’s observability tooling
Work closely with product engineering and cloud operations personnel to help administer all aspects of Teradata’s observability tooling in pre-production and production environments
Work with security and compliance teams to help provide evidence necessary to meet Teradata’s compliance obligations
Requirements:
Experience with at least one major cloud service provider (AWS, Azure, and/or Google Cloud), preferably all three
2+ years of administrative-level experience with Grafana or an equivalent observability tool, including but not limited to onboarding users; defining group policies; authoring monitors, alerts, and dashboards; and integrating with other enterprise applications such as ServiceNow
Experience with an infrastructure-as-code (IaC) cloud provisioning tool, preferably Terraform
Strong scripting skills with a modern programming language such as Python
Experience with a configuration management tool such as Ansible or Puppet
Experience with a build/deployment automation tool such as Jenkins or Bamboo
Experience with at least one modern source control tool, preferably Git
Experience with at least one modern defect tracking tool, preferably Jira
Familiarity with both SQL and NoSQL databases, and the use cases for each
Experience administering Linux-based systems
3 to 4 years of experience in the software industry in a DevOps or site reliability engineering role
An in-depth understanding of infrastructure-level and application-level monitoring principles and practice, across both production and non-production environments
An understanding of enterprise software deployment and security/compliance principles
Proficiency with multi-layered technical troubleshooting and root-cause analysis
The ability to quickly and comprehensively decompose a problem, identifying dependencies and defining tasks
The ability to work both independently and collaboratively in a fast-paced environment, and adjust as priorities change
The flexibility to work on a globally distributed team managed from the United States