Site Reliability Engineer (DevOps) Job at Hewlett Packard Enterprise (Amsterdam)

Job Description

Site Reliability Engineer (DevOps) - Netherlands

Mist AI is the AI-native networking solution from HPE Juniper Networking. Our Software Engineering team is seeking a Site Reliability Engineer to join our talented team and build high-quality technology solutions that revolutionize networking, powered by artificial intelligence in the cloud. Mist AI provides services through SaaS applications to many Fortune 100 and Fortune 500 customers.

You will take ops projects from concept through to launch. You will be responsible for maintaining and improving the company's production environment for rapid scaling and outstanding performance, and for helping us maintain stellar uptime and reliability. The improvements you implement will be felt by the entire organization.

To be successful, you need a hunger to learn and the ability to adapt to new technology quickly. We need people who are naturally curious, can self-start, and share learnings and outcomes effectively with a distributed team. You need to be a builder at heart.

Job Responsibilities

Apply your passion for infrastructure as code and continuous deployment to build scalable and highly reliable systems
Define and own KPIs around system availability, quality and scale
Partner with our developers and quality engineering teams to automate the monitoring, alerting, availability and scalability of our applications and systems
Ensure system availability and business continuity by implementing redundant servers/services
Manage after-hours infrastructure updates and maintenance
Proactively research and propose the use of new concepts, processes, technologies, and tools
Partner with software developers to create Mist standards for Microservices (APIs, schemas, serialization, data stores and best practices)
Run secure and scalable applications for highly available, multi-region, AWS and GCP deployments
Ship code several times per week
Be a part of our On-Call rotation
Own disaster recovery and business continuity plans

Requirements

An extensive background in developing and operating large-scale cloud-based distributed applications
Direct experience developing/running applications on AWS or Google Cloud
Laser focus and the ability to design infrastructure solutions for scalability, reliability, high availability, performance, security, software maintainability, and operational excellence
The ability to 'fix the plane while in flight' (not just support greenfield solutions)
The ability to prioritize existing technical and infrastructure debt, and the experience to build and execute a plan to pay it off
Experience delivering web-scale infrastructure for a global market at high release velocity
A deep understanding of distributed system design and dependency management
Solid experience with at least two of the following languages: Go, Java, Python
10+ years of industry experience managing infrastructure
5 years of Kubernetes administration in a large-scale SaaS environment
5 years maintaining production systems on AWS or GCP
3 years implementing, managing, and monitoring metrics specific to SaaS applications
3 years using infrastructure as code software (e.g. Terraform, AWS and Google Cloud Deployment, CloudFormation)
5 years of experience with continuous integration practices and tools (Jenkins, Travis CI, CircleCI, etc.)
Previous experience of contributing to war rooms and blameless postmortems
Superb communication skills, written and verbal
Experience of working in a true DevOps environment with daily collaboration
Thrives in a fast-paced startup environment where there may be multiple competing priorities
Customer-service mindset
Passion for improvement

Nice to have

Experience with Kafka, Spark, Storm, Cassandra, Elasticsearch, PostgreSQL, Redis, ZooKeeper, Nginx, Airflow
Experience of working with or contributing directly to Open Source projects
Understanding and experience of leading/managing technology products
Understanding of machine learning techniques and tools, and the ability to translate business requirements into data models and implement them in scalable, production-ready systems
Experience of working with failure-based testing
Experience working in a test-driven development environment

What we offer

Health & Wellbeing
Personal & Professional Development
Unconditional Inclusion
