HPC Lead Job at Linux Recruit (London)

HPC Operations Lead

Lead the systems that power discovery. Behind every breakthrough in modern scien...

Location

United Kingdom, London

Salary:

73000.00 - 82000.00 GBP / Year

Linux Recruit

Expiration Date

Until further notice

Requirements

Proven leadership experience
Strong operational awareness
Ability to manage complex services with limited resources and competing priorities
Ability to work collaboratively across teams
Experience with large-scale HPC clusters
Linux-based systems
Workload schedulers such as Slurm
Networking with InfiniBand
Parallel file systems such as GPFS
Experience with high-performance storage at petabyte scale

Job Responsibility

Play a central role in shaping how research computing services are delivered and evolved
Take ownership of the operational performance of a large-scale HPC and storage environment
Ensure systems are robust, responsive and continuously improving
Guide a specialist team
Oversee service delivery
Act as a key point of connection between technical teams and scientific users
Manage incidents and service performance
Influence long-term technology direction and strategy
Ensure complex infrastructure remains accessible and usable
Engage closely with researchers to understand their needs

Full-time

HPC Operations Lead

One of Europe’s most exciting research organisations is on the hunt for a Lead E...

Location

United Kingdom, London

Salary:

70000.00 - 80000.00 GBP / Year

Linux Recruit

Expiration Date

Until further notice

Requirements

Knowledge of HPC environments and large-scale storage
Experience leading people and platforms
Ability to communicate with clarity and warmth
Comfortable juggling priorities and working with different stakeholders
Ability to find practical solutions in a fast-moving research setting
Experience in science or biomedical research is beneficial
Curiosity and a collaborative mindset

Job Responsibility

Take ownership of high-performance compute and large-scale storage platforms
Ensure platforms are reliable, responsive, and ready
Work closely with researchers and technology teams
Oversee the HPC service desk
Guide incident response
Help shape the future direction of the platforms
Design and deliver training
Support users
Step into a wider leadership role when required

What we offer

Excellent benefits
Culture that encourages ideas, learning and teamwork

Full-time

HPC Supercomputer Onsite Administrator Team Lead

HPC Supercomputer Onsite Administrator Team Lead role at Hewlett Packard Enterpr...

Location

United States, Spring

Salary:

105500.00 - 243000.00 USD / Year

Hewlett Packard Enterprise

Expiration Date

Until further notice

Requirements

8+ years of professional experience
Bachelor of Arts/Science or equivalent degree in computer science or related area of study
Without a degree, 11+ years of relevant professional experience
Previous experience managing projects and leading small teams (of 3-10)
Installing, troubleshooting and supporting enterprise-level servers, storage, and networking equipment
HPC (High Performance Computing) or other large-scale systems/datacenter experience
Extensive Linux-based hardware troubleshooting and diagnostics experience
US Citizenship required
Self-starter who can work independently without supervision
Understanding of architectural dependencies of technologies

Job Responsibility

Report daily to and physically work at the Customer's Site
Accountable for meeting and maintaining customer's SLA (Service Level Agreement)
Engage in technical problem solving across multiple technologies
Own and drive service tickets to ensure timely resolution of system or customer issues
Lead in technical assessment and delivery of specific technical solutions to the customer
Perform and direct team for daily hardware diagnostics and repairs
Verify and implement detailed technical solutions to problems
Maintain good relationships with team members and customers
Collect data to determine customer needs and requirements
Respond to requests for technical information from customers

What we offer

Health & Wellbeing benefits
Personal & Professional Development programs
Unconditional Inclusion policy
Comprehensive benefits suite supporting physical, financial and emotional wellbeing

Full-time

Senior Software Engineer - ML Network Stack

We are seeking an experienced engineer to join our team that owns the network st...

Location

Israel, Tel Aviv

Salary:

Not provided

Amazon

Expiration Date

Until further notice

Requirements

5+ years of non-internship professional software development experience
5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience
5+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
3+ years as a mentor, tech lead or leading engineering teams
3+ years experience in SW/HW Co-Design

Job Responsibility

Be a senior engineer on a team that builds and maintains the infrastructure that monitors and reports on functionality and performance of massive testing workloads run at scale
Use internal Amazon CI/CD tools, Linux, and public AWS products to automate the delivery of our software to customers, saving developer time
Write Python code that effortlessly spools up large clusters and runs benchmarks and applications for ML and HPC workloads
Use AWS Managed Grafana and Athena to digest the massive amount of performance data generated by these workloads and create dashboards for developers and stakeholders
Invent automatic mechanisms to alert developers to functional and performance regressions so they never reach customers
Manage the complexity of infrastructure that covers many instance types, software stacks, Linux operating systems, cutting-edge releases and make it easy to evolve

Construction Delivery Lead

The Construction Delivery Lead (CDL) forms part of the Construction Delivery Tea...

Location

United Kingdom, Bristol

Salary:

Not provided

Amentum

Expiration Date

Until further notice

Requirements

Management of strategic planning of site set up, construction sequencing, recovery plans and resource allocation
Good working knowledge of commercial principles affecting construction matters
Ability to produce informative, concise reports
Motivational approach and the ability to energise team members by building a climate of trust and understanding
Considerable working knowledge in the delivery of large complex projects
Relevant Degree (or equivalent) in either Civil Construction
Working understanding of the post holder's obligations under CDM Regulations
NEBOSH, SMSTS or IOSH qualification holder

Job Responsibility

Manage and develop their Construction Delivery Managers, to ensure they have a good understanding of site activities and can carry out their role accordingly. Offer support and guidance, along with ensuring that all appropriate training for a CDM is undertaken
Fulfils Line Management responsibilities to Construction Delivery Managers
Oversee all aspects of Health, Safety and Environment during construction activities in their allocated area(s), in accordance with the NNB HPC Construction Phase Plan
Undertake and record safety and/or assurance inspections that support the project KPIs. The tool at HPC for this is INSIGHT
Construction Delivery Lead shares the Construction Delivery Manager's responsibilities depending on work area and scope complexity. When nominated by the SCDM, undertakes role of Construction Delivery Manager for specific areas as and when required

What we offer

free single medical cover and digital GP service
family-friendly benefits such as enhanced parental leave pay
free membership of employee assistance and parental programmes
reimbursement towards relevant professional development and memberships
matched-funding
paid volunteering time
charitable donations

Full-time

Principal Product Manager - Virtualization Architect

Designs, plans, develops, and manages a product or portfolio of virtualization p...

Location

United States, All

Salary:

Not provided

Hewlett Packard Enterprise

Expiration Date

Until further notice

Requirements

Bachelor's degree or equivalent in computer science, engineering or related field of study
10+ years of experience in product management, engineering, or a related technical role, with significant exposure to virtualization platforms and hypervisor technologies
Demonstrated hands-on or architectural familiarity with KVM, QEMU, libvirt, and the broader Linux virtualization stack
Deep technical knowledge of KVM hypervisor architecture, virtual machine lifecycle, vCPU scheduling, memory management (huge pages, NUMA), virtio device emulation, and hardware-assisted virtualization (Intel VT-x/AMD-V, IOMMU)
Strong understanding of virtualized networking (OVS, macvtap, SR-IOV, DPDK) and storage virtualization (virtio, iSCSI, NVMe-oF, Ceph/RBD) as they apply to KVM guest workloads
Familiarity with virtualization management and orchestration ecosystems - including libvirt APIs, oVirt, OpenStack Nova, and KubeVirt - and the ability to define product integration requirements across these layers
Extensive cross-functional leadership skills: ability to drive alignment across engineering, field, and partner organizations on complex, technically ambiguous virtualization platform initiatives
Strong financial and business acumen, including experience building business cases, defining performance metrics, and analyzing competitive positioning for infrastructure software products
Ability to provide product-specific technical training and enablement to sales, partners, and customer-facing teams on KVM and virtualization platform capabilities
Experience engaging with open-source ecosystems and upstream communities (Linux kernel, QEMU, libvirt, oVirt, OpenStack) as a product stakeholder

Job Responsibility

Independently leads end-to-end strategy and operational roadmap for one or more KVM-based virtualization products or a broader virtualization platform portfolio, spanning hypervisor core, management APIs, and guest ecosystem
Defines and drives the virtualization platform value proposition - including performance benchmarks, TCO advantages, and feature differentiation versus VMware, Hyper-V, and other competing hypervisor stacks - to support go-to-market and sales enablement
Synthesizes market and customer requirements (MRDs) by maintaining deep knowledge of enterprise virtualization use cases: VDI, server consolidation, cloud-native workloads, telco NFV/edge, and HPC virtualization
Translates KVM/QEMU/libvirt engineering capabilities into customer-facing requirements and product specifications, ensuring technical feasibility and roadmap alignment with upstream open-source communities (e.g., Linux kernel KVM subsystem, QEMU project)
Guides key stakeholders through all lifecycle phases - from hypervisor feature planning and kernel integration to product launch, sustaining engineering, and platform end-of-life planning
Collaborates across engineering, supply chain, and marketing to optimize product configuration, SKU design, pricing, and go-to-market strategies for virtualization platform offerings
Acts as a subject matter authority on KVM virtualization architecture, providing technical direction to internal teams, enabling sales and partner technical communities, and representing the product externally with customers and at industry forums

What we offer

Health & Wellbeing
Personal & Professional Development
Unconditional Inclusion

Full-time

Senior Cybersecurity Engineer

Senior Cybersecurity Engineer LOCATION: Eglin AFB, FL JOB STATUS: Full-time C...

Location

United States, Eglin Air Force Base

Salary:

Not provided

Astrion

Expiration Date

Until further notice

Requirements

Master’s Degree (in Computer Science, Cybersecurity or a related field). Relevant experience may be substituted for the degree
10 Years’ total experience, at least 8 of which is in cybersecurity engineering, architecture or R&D infrastructure
Top Secret Clearance with SCI. Eligible for Special Access Program (SAP) access. US Citizenship is required
DoD 8570/8140 IAT Level III (CISSP, CISM, or equivalent). Certifications: Security+, CEH, or other relevant security certifications
Expert-level knowledge of cybersecurity principles, risk management, and secure computing architectures
Hands-on experience with security tools and technologies, such as SIEM, intrusion detection/prevention systems, vulnerability scanners, and endpoint protection solutions. Experience with Host-Based Security System (HBSS), Assured Compliance Assessment Solution (ACAS), Nessus, Tenable.sc, Tenable.io, NNM, LCE, Nessus Manager, Agents, and Scanner
Experience with scripting (Python, PowerShell) and automation tools (Ansible, Chef)
Familiarity with Risk Management Framework (RMF), Authority to Operate (ATO) documentation, and enclave compliance management
Physically able to lift up to 50 lbs
Adaptable to fieldwork and hands-on installations

Job Responsibility

Collaborate with network engineers to architect secure network topologies for current and future connected and isolated environments, ensuring security is embedded in the design phase
Design and deploy security solutions for S&T environments that support continuous research, development, and DevSecOps, working closely with network engineers to implement and maintain these solutions
Advise on security planning for long-term initiatives, including SDREN integration and the Weapons Technology Integration Center (WTIC) and other facility projects, in conjunction with network planning efforts
Develop security innovation roadmaps aligned with mission goals and emerging technologies, coordinating with network engineers to ensure alignment with network modernization efforts
Coordinate with facilities, engineering, and network teams to ensure robust infrastructure supports secure research operations, focusing on the security aspects of network hardware/power/cooling needs and structured cabling
Lead security aspects of containerization, virtualization, and orchestration of systems to support laboratory computing, HPC, and edge devices, working with network engineers to implement secure configurations
Engineer the security architecture of multiple S&T networks in compliance with NIST 800-series, DoD RMF, DISA Security Technical Implementation Guides (STIGs), and cybersecurity best practices, collaborating with network engineers to ensure seamless integration. Review engineering, architecture, and designs to ensure DoD security policies are met
Implement DevSecOps pipelines to automate security scans and CI/CD deployments, working with network engineers to integrate security into existing pipelines
Manage ATO package development and collaborate with ISSMs, network engineers, and cybersecurity stakeholders to ensure compliance. Review and develop RMF Assessment and Authorization (A&A) documentation, e.g. System Security Plans (SSPs), Security Assessment Reports (SARs), and Plans of Action and Milestones (POA&Ms)
Integrate identity management and single sign-on solutions across enclaves and hybrid environments, coordinating with network engineers to implement and maintain these solutions. Analyze and tune HBSS policies for assets during integration test events. Perform verification and troubleshooting across all HBSS modules. Install updates to HBSS software as released and in compliance with STIG requirements. Monitor HBSS software to ensure that the clients/servers are operational and reporting properly

What we offer

Competitive salaries
Continuing education assistance
Professional development
Multiple healthcare benefits package options
401K with employer matching
Competitive time off policy along with a federally recognized holiday schedule

Full-time

Lead Engineer, ML Network Stack - Annapurna Labs

We are seeking an experienced engineer and technical leader to join our team tha...

Location

United States, Seattle; Cupertino

Salary:

168100.00 - 261500.00 USD / Year

Amazon

Expiration Date

Until further notice

Requirements

5+ years of non-internship professional software development experience
5+ years of leading design or architecture (design patterns, reliability and scaling) of new and existing systems experience
5+ years of full software development life cycle, including coding standards, code reviews, source control management, build processes, testing, and operations experience
3+ years as a mentor, tech lead or leading engineering teams
3+ years experience in SW/HW Co-Design

Job Responsibility

Be the lead engineer on a team that builds and maintains the infrastructure that monitors and reports on functionality and performance of massive testing workloads run at scale
Use internal Amazon CI/CD tools, Linux, and public AWS products to automate the delivery of our software to customers, saving developer time
Write Python code that effortlessly spools up large clusters and runs benchmarks and applications for ML and HPC workloads
Use AWS Managed Grafana and Athena to digest the massive amount of performance data generated by these workloads and create dashboards for developers and stakeholders
Invent automatic mechanisms to alert developers to functional and performance regressions so they never reach customers
Manage the complexity of infrastructure that covers many instance types, software stacks, Linux operating systems, cutting-edge releases and make it easy to evolve

What we offer

health insurance (medical, dental, vision, prescription, Basic Life & AD&D insurance and option for Supplemental life plans, EAP, Mental Health Support, Medical Advice Line, Flexible Spending Accounts, Adoption and Surrogacy Reimbursement coverage)
401(k) matching
paid time off
parental leave
sign-on payments
restricted stock units (RSUs)

Full-time