As a Senior Infrastructure Engineer, you will play a pivotal role in shaping the foundation of Distyl AI’s platform development. This is a hands-on role for an engineer with both the breadth and depth to build secure, scalable, and easily deployable infrastructure. Your work will ensure our AI platform, Distillery, is reliable across diverse customer-hosted cloud and on-prem environments. You’ll be responsible for more than just keeping systems running: you’ll architect and evolve the core infrastructure and pipelines that enable our AI-native applications to be delivered to Fortune 500 companies. The ideal candidate is an opinionated, pragmatic engineer energized by the opportunity to define how cutting-edge AI applications are deployed in production at a global scale. Strong communication skills and a demonstrated ability to take ownership are essential.
Job Responsibilities:
Design & Operate Cloud-Native Infrastructure: Architect and manage resilient, scalable deployments across cloud and hybrid environments, leveraging Kubernetes, serverless frameworks, and modern microservices architectures, along with the infrastructure that powers them
Evolve Infrastructure as Code: Drive the development of modular Terraform and GitOps configurations that ensure consistency, repeatability, and speed across multi-cloud environments
Advance Automation & CI/CD: Drive our automation strategy by building and refining CI/CD pipelines that enable rapid, reliable deployments. You'll apply modern practices like GitOps and work with a variety of build and deployment tools, such as Gradle, Bazel, Flux, Helm, and GitHub Actions, to reduce operational overhead
Embed Security & Compliance: Integrate IAM, secrets management, and secure service-to-service communication into every layer of the infrastructure
Drive Observability & Reliability: Establish robust monitoring, logging, and alerting practices with tools like Datadog, Prometheus, Grafana, and OpenTelemetry to ensure our systems are performant, secure, and enterprise-ready
AgentOps & AI Tooling: Design and build the critical infrastructure for our advanced AI systems. This includes architecting AI toolchains and handling complex agent integrations and deployments
Requirements:
6+ years of professional experience in infrastructure, DevOps, or systems engineering roles, with a proven track record of delivering production-grade systems
Deep expertise in cloud platforms (Azure, AWS, or GCP) and container orchestration with Kubernetes
Strong experience with Infrastructure as Code tools (Terraform, Pulumi, etc.) and GitOps workflows
Hands-on experience with CI/CD systems (Flux, ArgoCD, Helm, GitHub Actions, GitLab CI)
Proficiency in Python and Linux for automation and tooling
Solid knowledge of microservices architecture, distributed systems design, networking, security, IAM, and secrets management
Familiarity with modern observability stacks (Prometheus, Grafana, OpenTelemetry, Datadog)
Experience with serverless frameworks and event-driven architectures
A demonstrated ownership mindset—able to take initiative, make informed decisions, and drive projects forward
Bachelor’s degree in Computer Science, Engineering, or equivalent practical experience
Ability to travel 10-20%
What we offer:
100% covered medical, dental, and vision for employees and dependents
401(k) with additional perks (e.g., commuter benefits, in‑office lunch)
Access to state‑of‑the‑art models, generous usage of modern AI tools, and real‑world business problems
Ownership of high‑impact projects across top enterprises
A mission‑driven, fast‑moving culture that prizes curiosity, pragmatism, and excellence