We are a global team of innovators and pioneers dedicated to shaping the future of observability. At New Relic, we build an intelligent platform that empowers companies to thrive in an AI-first world by giving them unparalleled insight into their complex systems. As we continue to expand our global footprint, we're looking for passionate people to join our mission. If you're ready to help the world's best companies optimize their digital applications, we invite you to explore a career with us!
Job Responsibilities:
Architectural Leadership: Drive the design and implementation of internal tools using Golang, specifically focusing on Kubernetes Operators and Controllers to automate resource management
Platform Orchestration: Lead complex infrastructure shifts
Operational Excellence: Take ownership of incident response, authoring comprehensive retrospectives and implementing systemic hardening to prevent recurrence using advanced overcommit strategies
Requirements:
5–7 years in a DevOps, Site Reliability, or Infrastructure Engineering role
Deep knowledge of the internals of Kubernetes primitives (Deployments, StatefulSets, Rollouts) and experience writing custom Kubernetes Operators
Strong experience building production-grade tools and services in Go, specifically for infrastructure automation
Proven track record of managing production environments, handling high-severity incidents, and improving SLA compliance through automation
Hands-on experience with cloud-native scaling tools (e.g., Karpenter, Cluster API) and managing OS migrations (e.g., Flatcar)
Ability to lead projects as a "Captain," providing technical direction and unblocking team members across different time zones
Nice to have:
CKA Certification: Certified Kubernetes Administrator is highly preferred
AI/ML Infrastructure: Experience with GPU Operator enablement or hosting Small Language Models (SLMs) is a significant plus
Familiarity with Helm, GitOps (ArgoCD/Flux), and Grand Centra
What we offer:
Fostering a diverse, welcoming and inclusive environment
Flexible workforce model (fully office-based, fully remote, or hybrid)