High Performance Computing, AI and Labs is a critical part of HPE. We are focused on delivering innovative solutions that accelerate our customers’ digital transformation, enabling them to tackle their complex, data-intensive workloads. Combining deep expertise with the development of the world’s most cutting-edge, high-performance supercomputers, we are defining the next era of computing and delivering valuable insight and innovation. Join us and redefine what’s next for you.
Job Responsibility:
Develop, test and release Firmware and Driver components for HPC Option cards (InfiniBand, High Speed Ethernet adapters) on Linux, Windows and VMware OS
Qualify HPC Option card components on HPE Server platforms
Handle Level-4 support for HPC Fabrics components. Collaborate with Level-3 support, account teams and customers as needed, and provide technical assistance on escalated issues
Work closely with partners to ensure product quality requirements and release timelines are met
Collaborate with Engineering teams (Platform, Thermal, Factory, Benchmarking and Test) on issues related to HPC Option card firmware and driver components
Create and contribute to training materials, advisories and knowledge base articles on HPC Fabrics technology
Requirements:
Bachelor's or Master's degree in Computer Science, Information Systems, or equivalent
Typically 2-4 years of experience
Experience with Linux, Windows and VMware ESXi platforms
Scripting knowledge (Shell, Perl, Python)
Working knowledge of virtualization environments and hypervisors
Knowledge of server hardware architecture, PCIe and NVLink speeds, and concepts like NUMA
Hands-on experience with InfiniBand and Ethernet (RoCE) networks
Experience in configuring, troubleshooting and tuning InfiniBand and Ethernet networks
Hands-on experience with network performance testing (RDMA Perftest, iPerf, netperf) and application-level testing (NCCL, RCCL)
Knowledge of High Performance Computing (HPC) stack components, infrastructure and scale-out deployments
Ability to work with multiple internal teams and interact with customers/partners
Ability to prioritize and handle multiple tasks simultaneously
Excellent communication, collaboration and interpersonal skills
Ability to work well in a team environment, take on challenges, and work comfortably and effectively in new areas that require experimentation and rapid problem solving
Nice to have:
Experience with GPUs, compute accelerators, SSDs and storage controllers would be a plus
Cloud Architectures, Cross Domain Knowledge, Design Thinking, Development Fundamentals, DevOps, Distributed Computing, Microservices Fluency, Full Stack Development, Security-First Mindset, Solutions Design, Testing & Automation, User Experience (UX)