Our On-Prem engineering team is responsible for deploying Arize in customer environments. In addition to working with customers to define infrastructure requirements, the team designs and develops the software and tooling used to manage these systems at scale. The On-Prem team has built deep expertise in Kubernetes and cloud deployment on GCP, Azure, and AWS, as well as in the networking and security aspects of on-premise deployments. The team is dynamic and relies on a few talented individuals with a high degree of autonomy and initiative. For this role, we are prioritizing candidates who are based in Buenos Aires, Argentina.
Job Responsibilities:
Serve as the first point of contact for on-prem customer issues, triaging and resolving infrastructure and platform support requests
Monitor customers' platform health using existing observability tools
Investigate and troubleshoot Kubernetes-based deployments to identify root causes, apply fixes, and escalate when needed
Work hands-on with self-hosted environments to diagnose configuration, networking, and performance issues
Document common issues and resolutions to build out internal runbooks and an internal knowledge base
Requirements:
3+ years of experience in a DevOps, Infrastructure, or technical support role
Basic working knowledge of Kubernetes — comfortable reading logs, describing resources, and understanding pod/service behavior
Familiarity with at least one major cloud provider (AWS, GCP, or Azure)
Strong troubleshooting instincts and ability to systematically diagnose issues
Strong written and verbal communication skills in English
Eagerness to learn in a fast-paced environment with a high degree of autonomy
Nice to have:
Experience working with or supporting enterprise customers
Familiarity with Helm charts and how they're used to manage Kubernetes deployments
Experience with networking concepts like DNS, TLS, proxies, or firewalls