We are looking for a Developer Experience Engineer to enhance developer productivity, automation, and infrastructure across our hardware and software teams. You will work at the intersection of DevOps, software engineering, and high-performance computing (HPC), building systems that accelerate chip design, simulation, and AI model deployment in a hybrid cloud and on-prem environment.
Job Responsibilities
Develop and maintain automation tools to streamline development, testing, and deployment workflows
Optimize and manage Slurm-based job scheduling for AI workloads, simulation, and chip design workflows
Build observability solutions using Grafana, Prometheus, and OpenTelemetry for monitoring pipelines, infrastructure, and compute clusters
Manage and optimize containerized environments using Docker and Kubernetes to enhance scalability and reproducibility
Enhance build, test, and deployment pipelines with CI/CD and build tools like GitHub Actions, Jenkins, Buildkite, or Bazel
Develop caching and artifact management systems to reduce build times and improve dependency resolution
Integrate and manage cloud resources (AWS, GCP) for scaling compute, storage, and hybrid workloads
Support security and compliance efforts including secrets management and access control
Document and share best practices for efficient developer tooling and workflows
Requirements
Strong Python skills for automation, scripting, and infrastructure development
Experience with Slurm job scheduling in an HPC or hybrid environment
Hands-on experience with observability and monitoring tools like Prometheus, Grafana, and OpenTelemetry
Expertise with Docker and Kubernetes, including Helm charts and cluster management
Proficiency in modern CI/CD pipeline management with tools like GitHub Actions, Jenkins, or Buildkite
Experience with infrastructure-as-code tools like Terraform or Ansible
Knowledge of cloud infrastructure, compute, and storage optimization on AWS or GCP
Nice to have
Data pipelining for AI/ML workflows using Airflow, Prefect, or Dagster
Build system expertise with Bazel, CMake, or distributed build systems
Secrets management tools such as Vault, SOPS, AWS Secrets Manager, or GCP Secret Manager
Experience with AI/ML model training workflows and monitoring of GPU-accelerated workloads
Exposure to FPGA or ASIC development environments and workflows
What we offer
Full medical, dental, and vision packages, with generous premium coverage
Housing subsidy of $2,000/month for those living within walking distance of the office
Daily lunch and dinner in our office
Relocation support for those moving to West San Jose
Unlimited compute budget subject to ROI justification