Senior HPC Deployment Engineer

Hewlett Packard Enterprise

Location:
Australia, Melbourne

Category:
-

Contract Type:
Not provided

Salary:
Not provided

Job Description:

As a High Performance Computing (HPC) Solution Installation and Deployment Engineer, you will be responsible for the installation, configuration, and deployment of HPC systems. You will work closely with clients, project managers, and other technical staff to ensure that HPC solutions meet performance, reliability, and scalability requirements. This role demands a strong understanding of HPC architectures, networks, and software, along with excellent problem-solving skills.

Job Responsibility:

  • Install and configure HPC hardware and software components, including servers, storage, and networking equipment
  • Set up and manage high-speed interconnects (e.g., InfiniBand, Ethernet)
  • Deploy operating systems, cluster management software, and parallel file systems
  • Coordinate with clients and project managers to understand deployment requirements and timelines
  • Implement and document HPC deployment processes and best practices
  • Perform system testing and validation to ensure optimal performance and reliability
  • Provide technical support to clients during the installation and deployment phases
  • Conduct training sessions for clients on HPC system usage and maintenance
  • Develop and maintain user documentation and guides
  • Monitor and analyze system performance to identify and resolve bottlenecks
  • Optimize HPC configurations for specific applications and workloads
  • Implement performance tuning techniques for hardware and software
  • Work closely with hardware and software vendors to troubleshoot and resolve issues
  • Collaborate with internal teams to integrate HPC solutions with existing infrastructure
  • Communicate effectively with stakeholders to provide updates on project status and technical issues
  • Stay updated on the latest HPC technologies and trends
  • Recommend improvements to enhance system performance, reliability, and scalability
  • Participate in the evaluation and testing of new HPC products and solutions

Requirements:

  • Proven experience in installing, configuring, and deploying HPC systems
  • Strong knowledge of HPC architectures, parallel computing, and cluster management
  • Proficiency in Linux/Unix operating systems
  • Experience with HPC software tools and libraries (e.g., MPI, OpenMP, SLURM, Torque)
  • Familiarity with high-speed networking technologies (e.g., InfiniBand, Ethernet)
  • Excellent problem-solving skills and attention to detail
  • Strong communication and interpersonal skills
  • Ability to work independently and as part of a team

Nice to have:

  • Certifications in relevant technologies (e.g., Red Hat Certified Engineer, Certified HPC Professional)
  • Experience with cloud-based HPC solutions
  • Knowledge of scripting languages (e.g., Python, Bash)

What we offer:

  • Comprehensive suite of benefits supporting physical, financial, and emotional wellbeing
  • Specific programs for personal and professional development
  • Inclusion and flexibility to manage work and personal needs

Additional Information:

Job Posted:
September 10, 2025

Employment Type:
Full-time

Work Type:
On-site work

Similar Jobs for Senior HPC Deployment Engineer

Senior Research Engineer

The HPE HPC & AI EMEA Research Lab (ERL) is characterized by a unique blend of i...
Location:
Germany, Munich, Berlin
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Development experience in compiled languages such as C, C++ or Fortran and experience with interpreted environments such as Python
  • At least a B.Sc. equivalent in a Science, Technology, Engineering or Mathematical discipline
  • Parallel programming experience, with programming models such as OpenMP, MPI, CUDA, OpenACC, HIP, PGAS languages, etc.
  • An understanding of AI/ML frameworks; experience with frameworks such as TensorFlow or PyTorch is highly desirable
  • An interest in system and data center monitoring and operational data analysis
  • Professional language skills in English and German
Job Responsibility:
  • Perform world-class research while also shaping products of the future
  • Work with the most esteemed research partners across Europe
  • Enable high performance research software on pre-Exascale and Exascale supercomputers
  • Provide new environments/abstractions to support application developers to build, deploy, and run applications taking advantage of leading-edge hardware at scale
  • Make and operate HPC/AI systems and datacenters in a sustainable way
  • Manage modern data-intensive workloads in high performance environments
What we offer:
  • Competitive salary and extensive benefits package (pension scheme, insurances, bike and car leasing, and other fringe benefits)
  • Work-life balance (flexible working time and hybrid workplace model, 30 vacation days, four HPE Wellness-Fridays, up to six months paid parental leave)
  • Support for education, training, and career development
  • Diverse and dynamic work environment

Senior Linux System Administrator - Support Engineer

Senior Linux System Administrator/System Support Engineer with expertise support...
Location:
Australia, Canberra
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree in Computer Science, Information Technology, or related field, or equivalent work experience
  • At least 5 years of hands-on experience managing Linux systems in production environments, including HPC systems
  • Expertise in Linux/Unix operating systems, parallel file systems (Lustre, GPFS), and networking technologies
  • Proficiency in scripting/programming languages (Bash, Python, Perl, C++)
  • Experience with automation/configuration management tools (Ansible, Puppet, Chef, Terraform)
  • Strong understanding of networking concepts (TCP/IP, DNS, DHCP, firewalls, VPNs)
  • Familiarity with monitoring/logging tools (Nagios, Grafana, ELK Stack)
  • Experience with containerization technologies (Docker, Kubernetes)
  • Excellent problem-solving, analytical, and communication skills
  • Demonstrated ability to work independently in multi-technology environments and collaborate across teams
Job Responsibility:
  • Deploy, configure, maintain, and troubleshoot Linux servers (Red Hat, CentOS, Ubuntu, or others) across physical, virtual, and cloud environments
  • Support, maintain, and optimize HPC systems, including installation, servicing, and advanced technical troubleshooting of hardware/software and parallel file systems
  • Monitor system performance, availability, and security using industry-standard tools and practices
  • Plan and execute upgrades, patches, enhancements, and migrations to ensure systems are current, secure, and optimized
  • Automate system administration tasks using scripting languages and configuration management tools
  • Implement and maintain backup/recovery strategies, disaster recovery plans, and system documentation
  • Collaborate with development, network, and security teams to support application deployments and troubleshoot issues
  • Provide technical consulting, mentoring, and guidance to junior team members
  • Ensure compliance with strict security protocols in sensitive environments
  • Participate in on-call rotation and respond to system incidents and outages
What we offer:
  • Competitive salary and performance-based bonuses
  • Comprehensive health, dental, and vision insurance
  • Retirement plan options
  • Paid time off and holidays
  • Professional development opportunities
  • Flexible work arrangements
Employment Type:
Full-time

HPC Principal Federal Technical Consultant

Principal Consultant to join our High-Performance Computing (HPC) team. In this ...
Location:
United States
Salary:
115500.00 - 266000.00 USD / Year
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • 8+ years of professional experience, with at least 3 in HPC architecture, systems engineering, or large-scale infrastructure design
  • Advanced degree in Computer Science, Engineering, Physics, or related technical field (or equivalent experience)
  • Proven ability to design and deliver complex, multi-vendor HPC solutions at scale
  • Demonstrated ability to independently complete solution implementations and application design deliverables
  • Must be a United States citizen, as this role supports a Federal site
  • Top Secret Clearance, TS/SCI with Full Scope Polygraph (FSP)
  • Must be willing to travel as the business dictates
  • Expertise in one or more of the following: parallel computing, MPI/OpenMP, GPU acceleration, workload schedulers (Slurm, Altair PBS Pro, Torque/MOAB, etc.), or large-scale data storage systems (Lustre, GPFS, Ceph)
  • Experience with network boot technologies (PXE, gPXE/Etherboot, etc.)
  • Storage-specific knowledge: LVM, RAID, iSCSI, disk partitioning (GPT, MBR)
Job Responsibility:
  • Lead the technical implementation design and delivery of world-class scale HPC solutions, from requirements gathering to implementation
  • Provide architectural guidance on compute, storage, networking, and workload management tailored to customer use cases
  • Configure, deploy, and maintain Linux-based HPC clusters, associated storage, and network infrastructure
  • Work in close collaboration with customers on finalizing and deploying HPC software applications, hosting platforms, and management systems that enable customer research and production workloads
  • Provide technical support and troubleshooting for HPC implementation in secure locations
  • Work on both operational support and strategic HPC projects
  • Actively participate in customer user group environments
  • Evaluate and implement new tools, middleware, and methodologies to improve operations and service delivery
  • Ensure compliance with enterprise IT security and technology controls
  • Act as principal consultant in customer engagements, often leading cross-functional project teams (including customer staff)
What we offer:
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
  • Comprehensive suite of benefits supporting physical, financial and emotional wellbeing
Employment Type:
Full-time

Senior Systems Engineer HPC

Location:
India, Gurgaon
Salary:
Not provided
Rackspace
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Engineering, or a related field (equivalent experience may substitute for degree)
  • Minimum of 10 years of systems experience, including at least 5 years working specifically with HPC
  • Strong knowledge of Linux operating systems (e.g., Rocky Linux, Ubuntu) with a fundamental understanding of Linux internals, system administration, and performance tuning
  • Experience building and managing RPM and DEB packages
  • Experience with cluster management tools such as Bright Cluster Manager, OpenHPC stack, or Warewulf
  • Proficiency with job schedulers and resource managers such as Slurm and LSF
  • Strong understanding of Linux networking (e.g., TCP/IP, DNS, routing) and HPC interconnects (e.g., InfiniBand, Ethernet) including performance tuning
  • Knowledge of parallel file systems such as Lustre, Ceph, or GPFS
  • Working knowledge of Linux authentication and directory services such as LDAP and Active Directory
  • Strong experience with DevOps and configuration management tools, including Ansible, Terraform, Jenkins, and Git
Job Responsibility:
  • System Administration & Maintenance: Install, configure, and maintain HPC clusters (hardware, software, operating systems), perform regular updates/patching, manage user accounts and permissions, and troubleshoot/resolve hardware or software issues
  • Performance & Optimization: Monitor and analyse system and application performance, identify bottlenecks, implement tuning solutions, and profile workloads to improve efficiency
  • Cluster & Resource Management: Manage and optimize job scheduling, resource allocation, and cluster operations using tools such as Slurm, LSF, Bright Cluster Manager / Base Command Manager, OpenHPC, and Warewulf
  • Networking & Interconnects: Configure, manage, and tune Linux networking (TCP/IP, DNS, routing) and high-speed HPC interconnects (InfiniBand, Ethernet) to ensure low-latency, high-bandwidth communication
  • Storage & Data Management: Implement and maintain large-scale storage and parallel file systems (Lustre, Ceph, GPFS), ensure data integrity, manage backups, and support disaster recovery
  • Security & Authentication: Implement security controls, ensure compliance with policies, and manage authentication and directory services such as LDAP and Active Directory
  • DevOps & Automation: Use configuration management and DevOps practices (Ansible, Terraform, Jenkins, Git) to automate deployments, application packaging (RPM/DEB), and system configurations
  • User Support & Collaboration: Provide technical support, documentation, and training to researchers; collaborate with scientists, HPC architects, and engineers to align infrastructure with research needs
  • Planning & Innovation: Contribute to the design and planning of HPC infrastructure upgrades, evaluate and recommend hardware/software solutions, and explore cloud-based HPC solutions where applicable
Employment Type:
Full-time

HPC Principal Federal Technical Consultant

In this role, you will serve as a trusted technical advisor for customers, guidi...
Location:
United States
Salary:
115500.00 - 266000.00 USD / Year
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • 8+ years of professional experience, with at least 3 in HPC architecture, systems engineering, or large-scale infrastructure design
  • Advanced degree in Computer Science, Engineering, Physics, or related technical field (or equivalent experience)
  • Proven ability to design and deliver complex, multi-vendor HPC solutions at scale
  • Demonstrated ability to independently complete solution implementations and application design deliverables
  • Must be a United States citizen, as this role supports a Federal site
  • Top Secret Clearance, TS/SCI with Full Scope Polygraph (FSP)
  • Must be willing to travel as the business dictates
  • Expertise in one or more of the following: parallel computing, MPI/OpenMP, GPU acceleration, workload schedulers (Slurm, Altair PBS Pro, Torque/MOAB, etc.), or large-scale data storage systems (Lustre, GPFS, Ceph)
  • Experience with network boot technologies (PXE, gPXE/Etherboot, etc.)
  • Storage-specific knowledge: LVM, RAID, iSCSI, disk partitioning (GPT, MBR)
Job Responsibility:
  • Lead the technical implementation design and delivery of world-class scale HPC solutions, from requirements gathering to implementation
  • Provide architectural guidance on compute, storage, networking, and workload management tailored to customer use cases
  • Configure, deploy, and maintain Linux-based HPC clusters, associated storage, and network infrastructure
  • Work in close collaboration with customers on finalizing and deploying HPC software applications, hosting platforms, and management systems that enable customer research and production workloads
  • Provide technical support and troubleshooting for HPC implementation in secure locations
  • Work on both operational support and strategic HPC projects
  • Actively participate in customer user group environments
  • Evaluate and implement new tools, middleware, and methodologies to improve operations and service delivery
  • Ensure compliance with enterprise IT security and technology controls
  • Act as principal consultant in customer engagements, often leading cross-functional project teams
What we offer:
  • Comprehensive suite of benefits that supports physical, financial, and emotional wellbeing
  • Programs tailored to helping employees reach their career goals
  • Inclusive work environment
Employment Type:
Full-time

Senior MLOps Engineer

If you’re passionate about scalability, automated deployment, and well-optimized...
Location:
Romania, Bucharest
Salary:
Not provided
IT Genetics Romania
Expiration Date:
Until further notice
Requirements:
  • University degree, preferably in engineering (software, industrial, mechanical, process) or a related field
  • Over 5 years of experience in MLOps or machine learning engineering, with a focus on deploying and managing deep learning models at scale
  • Strong skills in Python, CI/CD pipelines, and ML frameworks (e.g., PyTorch, TensorFlow, OpenCV) for automating and scaling ML workflows
  • Expertise in monitoring and alert automation for ML workflows, including data pipelines, training processes, and model performance (e.g., Prometheus, Grafana)
  • Familiarity with distributed training techniques, multi-GPU strategies, and hardware optimization for deep learning
  • Strong communication and interpersonal skills
Job Responsibility:
  • Design end-to-end architecture for the automated training of ML models
  • Create data pipelines to build relevant datasets and data annotation flows
  • Monitor ML model performance and data drift
  • Handle versioning, deployment, and integration with the software team
  • Develop and manage CI/CD pipelines for building, testing, and deploying models
  • Apply best practices for model versioning, rollback, and A/B testing to ensure reliable and accurate production releases
  • Set up a robust monitoring system and develop automated alerting solutions to proactively identify issues in data pipelines, model training, validation, and data variation
  • Promote MLOps best practices (Infrastructure as Code, reproducibility, security) and continuously improve internal processes to increase reliability and efficiency
  • Research and implement cutting-edge technologies to improve training efficiency (e.g., distributed training, HPC, multi-GPU strategies) for the research team
  • Explore future MLOps frameworks and GPU-based cloud solutions as part of the scalability roadmap
What we offer:
  • Meal tickets
  • A place where your voice truly matters
  • Performance bonuses
  • A day off on your birthday
  • Private medical subscription
  • Training and learning resources
  • Hybrid work model
  • Bookster subscription
  • A friendly, passionate, and solution-oriented team
  • Opportunities to grow or change your role within the company

Senior Linux System Administrator - Support Engineer

We are seeking an experienced Senior Linux System Administrator/System Support E...
Location:
Australia, Canberra
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in Computer Science, Information Technology, or related field, or equivalent work experience
  • At least 5 years of hands-on experience managing Linux systems in production environments, including HPC systems
  • Expertise in Linux/Unix operating systems, parallel file systems (Lustre, GPFS), and networking technologies
  • Proficiency in scripting/programming languages (Bash, Python, Perl, C++)
  • Experience with automation/configuration management tools (Ansible, Puppet, Chef, Terraform)
  • Strong understanding of networking concepts (TCP/IP, DNS, DHCP, firewalls, VPNs)
  • Familiarity with monitoring/logging tools (Nagios, Grafana, ELK Stack)
  • Experience with containerization technologies (Docker, Kubernetes)
  • Excellent problem-solving, analytical, and communication skills
  • Ability to diagnose complex technical problems to root cause
Job Responsibility:
  • Deploy, configure, maintain, and troubleshoot Linux servers and HPC cluster systems (Red Hat, CentOS, Ubuntu, or others) across physical (primarily), virtual, and cloud environments
  • Support, maintain, and optimize HPC systems, including cluster manager, operating system and network fabric installation, servicing, and advanced technical troubleshooting of hardware/software and parallel file systems (e.g., Lustre, GPFS)
  • Monitor system performance, availability, and security using industry-standard tools and practices; ensure compliance with organizational policies and external regulations
  • Plan and execute upgrades, patches, enhancements, and migrations to ensure systems are current, secure, and optimized
  • Automate system administration tasks using scripting languages (Bash, Python, Perl, etc.) and configuration management tools (Ansible, Puppet, Chef, Terraform)
  • Implement and maintain backup/recovery strategies, disaster recovery plans, and system documentation
  • Collaborate with development, network, and security teams to support application deployments and troubleshoot issues, particularly in multi-technology HPC environments
  • Provide technical consulting, mentoring, and guidance to junior team members and contribute to internal knowledge sharing
  • Ensure compliance with strict security protocols in sensitive environments (e.g., government, research)
What we offer:
  • Health & Wellbeing: comprehensive suite of benefits that supports physical, financial and emotional wellbeing
  • Personal & Professional Development: specific programs tailored to helping you reach your career goals
  • Unconditional Inclusion: unconditionally inclusive in the way we work and celebrate individual uniqueness

Senior Software Integration Engineer

2HB Incorporated is seeking a Senior Software Integration Engineer to support it...
Location:
United States, Annapolis Junction
Salary:
Not provided
2HB
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in computer science or a related field, plus at least six (6) years of demonstrable experience integrating, installing, configuring, upgrading, compiling, and supporting COTS/GOTS software in a heterogeneous operating system environment
  • Alternatively, five (5) years of full-time, directly related computer science work may be substituted for the degree, in addition to the six (6) years of demonstrable experience described above
  • Experience using the Linux CLI and Linux tools
  • Experience developing Bash/Python scripts to automate repetitive tasks, deploy test environments, and execute test suites
  • Experience with IaC principles and automation tools including Ansible
  • Experience with DevOps processes and related FOSS toolchains
  • Experience with CI/CD concepts, principles, methodologies, and tools including GitLab
  • Experience integrating metrics and monitoring frameworks including Splunk
  • Experience creating documentation for reporting integration results to relevant stakeholders
  • TS/SCI/Full Scope Polygraph Clearance
Job Responsibility:
  • Automation and system testing
  • Creation of technical artifacts
  • Coordination of deployment activities in designated HPC Linux environments
Employment Type:
Full-time