We are seeking an experienced Senior Linux System Administrator/System Support Engineer with expertise in supporting High Performance Computing (HPC) environments to join our HPC support team. In this role, you will design, implement, maintain, and optimize Linux-based infrastructure, ensuring high availability, security, and performance for mission-critical systems and services, including complex HPC platforms. You will provide advanced technical support, troubleshoot challenging issues across hardware and software, and act as a trusted advisor to both internal teams and external customers. This is an exciting opportunity for a highly skilled professional to work in a collaborative and challenging environment. On-site presence is mandatory to deliver exceptional customer support and maintain system performance.
Job Responsibilities:
Deploy, configure, maintain, and troubleshoot Linux servers and HPC cluster systems (Red Hat, CentOS, Ubuntu, or others) across physical (primarily), virtual, and cloud environments
Support, maintain, and optimize HPC systems, including installation and servicing of the cluster manager, operating system, and network fabric, and advanced technical troubleshooting of hardware, software, and parallel file systems (e.g., Lustre, GPFS)
Monitor system performance, availability, and security using industry-standard tools and practices, and ensure compliance with organizational policies and external regulations
Plan and execute upgrades, patches, enhancements, and migrations to ensure systems are current, secure, and optimized
Automate system administration tasks using scripting languages (Bash, Python, Perl, etc.) and configuration management tools (Ansible, Puppet, Chef, Terraform)
Implement and maintain backup/recovery strategies, disaster recovery plans, and system documentation
Collaborate with development, network, and security teams to support application deployments and troubleshoot issues, particularly in multi-technology HPC environments
Provide technical consulting, mentoring, and guidance to junior team members and contribute to internal knowledge sharing
Ensure compliance with strict security protocols in sensitive environments (e.g., government, research); TSPV clearance will be required
Participate in on-call rotation and respond to system incidents and outages
Assist with technical proposals, solution design, and enterprise-level architecture for new projects and customer engagements
Requirements:
Bachelor’s degree in Computer Science, Information Technology, or related field, or equivalent work experience
At least 5 years of hands-on experience managing Linux systems in production environments, including HPC systems
Expertise in Linux/Unix operating systems, parallel file systems (Lustre, GPFS), and networking technologies
Proficiency in scripting/programming languages (Bash, Python, Perl, C++)
Experience with automation/configuration management tools (Ansible, Puppet, Chef, Terraform)
Strong understanding of networking concepts (TCP/IP, DNS, DHCP, firewalls, VPNs)
Familiarity with monitoring/logging tools (Nagios, Grafana, ELK Stack)
Experience with containerization technologies (Docker, Kubernetes)
Excellent problem-solving, analytical, and communication skills, with the ability to diagnose complex technical problems to root cause
Demonstrated ability to work independently in multi-technology environments and collaborate across teams
TSPV Government Security clearance (mandatory)
Nice to have:
Relevant certifications (RHCE, LFCS, AWS Certified SysOps Administrator, etc.)
Accountability
Active Learning
Active Listening
Bias
Business Growth
Client Expectations Management
Coaching
Creativity
Critical Thinking
Cross-Functional Teamwork
Customer Centric Solutions
Customer Relationship Management (CRM)
Design Thinking
Empathy
Follow-Through
Growth Mindset
Information Technology (IT) Infrastructure
Infrastructure as a Service (IaaS)
Intellectual Curiosity
Long Term Planning
Managing Ambiguity
Process Improvements
Product Services
Relationship Building
What we offer:
Health & Wellbeing: comprehensive suite of benefits that supports physical, financial and emotional wellbeing
Personal & Professional Development: specific programs catered to helping you reach any career goals
Unconditional Inclusion: unconditionally inclusive in the way we work and celebrate individual uniqueness