HPE Operations is our innovative IT services organization. It provides the expertise to advise, integrate, and accelerate our customers' outcomes from their digital transformation. Our teams collaborate to transform insight into innovation. In today's fast-paced, hybrid IT world, being at business speed means overcoming IT complexity to match the speed of actions to the speed of opportunities.
Job Responsibility:
Review and validate HPC solutions and environments through POCs and benchmarking
Architect and design HPC solutions tailored to the customer's needs
Oversee solution implementation, integration and testing
Diagnose and correct solution issues during implementation
Provide training, documentation and ongoing support
Manage the life cycle of the HPC environment
Oversee the team's operations and deliverables
Lead the team with technical expertise and ensure regular technical sessions and case reviews
Demonstrate a high level of technical and communication skills in critical situations
Take end-to-end ownership of problems and their resolution
Work effectively as a team player
Requirements:
8-12 years of experience with different flavours of Linux like SLES, RHEL and Ubuntu/Debian
5-8 years of experience managing HPC/Linux clusters, with a good understanding of their architecture
Skilled in installation and configuration of various applications on Linux
Install, administer, and maintain hardware, system software, networking, accounts, and security measures in VMware environments
Diagnose and resolve system issues and performance issues
Experience in drafting technical SOPs, action plans and knowledge documents
Good understanding of different cloud platforms
Restore system integrity as quickly as possible following an outage
Triage and solve user-submitted tickets
Track resource usage using monitoring and queuing software
Demonstrated expertise with Linux system administration including OS, networking, storage, Docker and security
Experience with high-speed networking such as InfiniBand and 10/40 Gigabit Ethernet
Familiarity with large storage systems (Scality, Weka, Lustre, GPFS, others)
Experience with HPC cluster managers (HPCM, Bright Cluster Manager)
Experience in server hardware patching and troubleshooting
Experience managing HPC clusters and GPUs
Experience using and supporting job schedulers such as SLURM, PBS or other schedulers
Familiar with Shell/Python scripting and Ansible
Familiar with monitoring tools like Grafana/Nagios/OpsRamp
Familiar with virtualization technologies like KVM, VMware, vCenter
Infrastructure Monitoring: Nagios, OpsRamp, HPE PCM, NVIDIA BCM, SolarWinds
Virtualization: Containers, Kubernetes, VMware and OpenShift
Demonstrate strong written and verbal communication skills
Ability to lead and guide the team while serving as SPOC
Ability to work rotating shifts in a 24x7 environment
Nice to have:
Willingness to assist peers is an added advantage
What we offer:
Health & Wellbeing benefits
Personal & Professional Development programs
Unconditional Inclusion environment
Comprehensive suite of benefits supporting physical, financial and emotional wellbeing