We are looking for a Site Reliability Engineer (SRE) to support reliable, high-performing production systems for automotive operations clients. This position focuses on strengthening service stability across edge and cloud environments through automation, observability, and disciplined operational practices. The role works closely with engineering and technical stakeholders to improve uptime, manage incidents, and deploy changes safely in real-time manufacturing settings.
Job Responsibilities
Maintain dependable and secure production environments across plant-edge and cloud-based systems, with a focus on uptime, responsiveness, and operational stability
Design, refine, and support monitoring dashboards, alerting frameworks, and operational runbooks using tools such as Prometheus, Grafana, and modern telemetry solutions
Build and manage infrastructure through code using Terraform, applying version control standards, peer reviews, and controlled deployment processes
Create automation scripts and lightweight tools in Bash and Python to streamline routine operations, recovery procedures, backup workflows, and environment setup
Take part in incident response and on-call coverage, troubleshoot service disruptions, coordinate initial communication, and document follow-up actions through blameless reviews
Establish and measure service reliability indicators and objectives, helping stakeholders balance system dependability with release speed and operational risk
Support secure connectivity between factory networks and cloud resources by configuring and maintaining VPNs, routing, private networking, and access controls
Administer and optimize relational or time-series databases, including backup planning, replication, performance tuning, and long-term storage health
Contribute to CI/CD delivery practices by improving deployment pipelines, supporting controlled release strategies, and preparing rollback procedures when needed
Partner with controls, software, and data teams to enable reliable data flow from industrial systems and ensure safe deployment to edge infrastructure
Requirements
Bachelor’s degree in Information Technology, Computer Science, Computer Engineering, or comparable practical experience
At least 5 years of experience supporting production environments in a corporate, startup, or similarly fast-paced technical setting
Hands-on expertise with infrastructure as code, including Terraform, along with experience in cloud platforms and related services
Working knowledge of container technologies such as Docker and orchestration platforms like Kubernetes
Experience supporting live systems, participating in on-call rotations, and contributing to incident reviews and corrective actions
Proficiency with automation and scripting using Bash and Python to reduce manual operational effort
Strong communication skills with the ability to explain technical decisions and tradeoffs to cross-functional or non-technical stakeholders
Willingness and ability to travel to customer or plant locations as business needs require
What we offer
Medical, vision, dental, life, and disability insurance