Machine Learning Systems Administrator - HPC Infrastructure

Zyphra

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

Not provided

Job Description:

As a Machine Learning Systems Administrator - HPC Infrastructure, you will be responsible for maintaining and developing the core infrastructure behind our machine learning research and production efforts. You’ll work closely with various training and inference teams to ensure the smooth operation of our systems while laying the groundwork for scalable, secure, and efficient workflows. You’ll have a significant impact on both developer productivity and training and inference performance.

Job Responsibility:

  • Maintaining and developing the core infrastructure behind our machine learning research and production efforts
  • Administration and automation of our Linux-based cluster environments
  • Managing user onboarding/offboarding, security auditing, and access control
  • Monitoring system resources and job scheduling
  • Supporting and improving developer workflows (e.g., VSCode compatibility, Docker)
  • Enabling and supporting AI/ML workloads, including large-scale training jobs

Requirements:

  • Strong experience with Linux system administration, user and access management, and automation
  • Demonstrated expertise in scripting languages for system tooling and automation (Bash, Python, etc.)
  • Familiarity with containerized environments (e.g., Docker) and job scheduling systems like Slurm
  • Experience building tooling for cluster validation and reliability (GPU, networking, storage health checks)
  • Experience setting up and managing developer tools and third-party services (e.g., cloud storage providers, Docker Hub, Slack, Gmail, Telegraf, experiment trackers)
  • Excellent debugging and troubleshooting skills across compute, storage, and networking
  • Strong communication skills and ability to collaborate across technical and non-technical teams

Nice to have:

  • Experience with infrastructure as code (e.g., Ansible, Terraform)
  • Prior work supporting ML/AI infrastructure, including GPU management and workload optimization
  • Exposure to backend development for ML model serving (e.g., vLLM, Ray, SGLang)
  • Experience working with cloud platforms such as AWS, Azure, or GCP
  • Familiarity with containers (Docker, Apptainer) and their integration with scheduling systems (Slurm, Kubernetes)

What we offer:

  • Comprehensive medical, dental, vision, and FSA plans
  • Competitive compensation and 401(k)
  • Relocation and immigration support on a case-by-case basis
  • On-site meals prepared by a dedicated culinary team
  • Thursday Happy Hours

Additional Information:

Job Posted:
February 14, 2026

Employment Type:
Full-time
Work Type:
On-site work
Similar Jobs for Machine Learning Systems Administrator - HPC Infrastructure

Senior Solution Architect AI & HPC

AI is a high-growth market for HPE, and we believe we are uniquely suited to bri...
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • Bachelor's or Master's degree in Engineering, Computer Science, or similar quantitative focus preferred
  • Ability to quickly prototype functionality into scripts for demos, integrations, troubleshooting, etc.
  • Expertise in cloud architectures, specifically with public cloud platforms such as AWS, Azure, or Google Cloud
  • Strong understanding of AI technologies, including machine learning, deep learning, and neural networks
  • Experience participating in solution configurations and the creation of PoCs to meet customer requirements
  • Solid knowledge of infrastructure components, including servers, storage, networking, and virtualization
  • Experience with high-performance computing (HPC) and GPU-accelerated systems is advantageous
  • Demonstrates expert technical skills in assigned area of specialization
  • Expert knowledge of the company offerings, strategic initiatives, current trends, competitor products and strategies within area of responsibility
  • Expert-level written and verbal communication skills and mastery of English and the local language
Job Responsibility:
  • Collaborate with sales teams to understand customer requirements and develop tailored solutions for their AI infrastructure needs
  • Engage in pre-sales activities, including technical presentations, demonstrations, and proof-of-concepts
  • Act as a trusted advisor to customers, addressing their questions, concerns, and technical challenges effectively
  • Stay up-to-date with the latest advancements in AI technologies, cloud architectures, and infrastructure trends
  • Lead Proof-of-Concepts (PoC) for HPE customers expanding into Deep Learning or Machine Learning use cases
  • Architect reusable end-to-end AI solutions for HPE customers and prospects
  • Lead technical discussions with customers and partners to propose HPE and partner integrated solutions
  • Identify solutions, define action plans, and help coordinate and deliver optimal solutions and enhancements
  • Recommend configurations and settings for different types of hardware and interconnect fabrics
  • Assist with any product or technical issue affecting an initial sale or a customer renewal
What we offer:
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
  • Comprehensive suite of benefits that supports physical, financial and emotional wellbeing
Employment Type:
Full-time

Associate Director, Business Consultancy

Lead and scale a strategic, AI‑enabled consulting organisation that delivers mea...
Location:
United Kingdom
Salary:
Not provided
Bloomreach
Expiration Date:
Until further notice
Requirements:
  • Lead and scale a strategic, AI‑enabled consulting organisation
  • Own the vision, performance, and evolution of the Business Consultancy function and the Service Delivery Centre (SDC)
  • Set and continually refine the strategy for Business Consultancy and the SDC in EMEA
  • Own regional performance for Business Consultancy and SDC
  • Represent Business Consultancy in EMEA leadership forums
  • Champion a data‑driven, AI‑enabled consulting approach
  • Define and oversee standards for implementation and onboarding quality across EMEA
  • Drive the creation and adoption of scalable delivery frameworks, playbooks, and QA models
  • Ensure platform adoption is maximised from day one
  • Act as a senior strategic advisor for key clients
Job Responsibility:
  • Strategic leadership & regional impact
  • Service delivery strategy & client value
  • Service Delivery Centre (SDC) leadership
  • Scaling Professional Services for growth
  • AI fluency & innovation
  • People leadership & organisational development
  • Cross‑functional leadership & continuous improvement
What we offer:
  • A great deal of freedom and trust
  • Flexible working hours
  • Work virtual-first with several Bloomreach Hubs available across three continents
  • Company events
  • 5 paid days off to volunteer
  • People Development Program
  • Communication coach available
  • Leader Development Program
  • $1,500 professional education budget annually
  • Employee Assistance Program with counselors
Employment Type:
Full-time

Coordinator, Philanthropy & Community Impact

The Coordinator of Philanthropy and Community Impact will provide support across...
Location:
United States, Las Vegas
Salary:
Not provided
UFC
Expiration Date:
Until further notice
Requirements:
  • Bachelor’s degree in communications or marketing preferred
  • 3+ years of experience working with philanthropies, nonprofits, or charitable organizations
  • Strong research skills and computer skills (including Word, PowerPoint and Excel)
  • Experience managing conception, development and distribution of annual reports and newsletters
  • Ability to project manage, work with multiple colleagues as part of a virtual team, gather and synthesize content and successfully brief and collaborate with design teams
  • Excellent organizational and time management skills, ability to multi-task, analyze and coordinate multiple, on-going projects
  • Ability to maintain discretion and handle confidential information
Job Responsibility:
  • Assists in the development and execution of UFC’s global philanthropic and impact strategy
  • Researches and fosters relationships with non-profit organizations that align with UFC’s charitable mission and goals
  • Supports internal and external CSR reporting initiatives, updates and ongoing management
  • Responsible for updating CSR documentation (filing documents, presentations, uploading assets, quarterly and weekly updates and communication)
  • Coordinates CSR activations as part of select fight week events, including managing participation of UFC athletes in select community events
  • Assists in the development of fundraising events and programs for the UFC Foundation
  • Supports the preparation and execution of UFC Foundation fundraising initiatives at select events, including 50/50 raffles, silent auctions, and experience fulfillment
  • Travels to assigned UFC events
  • Other projects, tasks, and duties as assigned
Employment Type:
Full-time

Non-Acute Sales Consultant

Performance Health is seeking a Non-Acute Sales Consultant to join our expanding...
Location:
United States, Newark, New Jersey
Salary:
70000.00 - 75000.00 USD / Year
Performance Health
Expiration Date:
Until further notice
Requirements:
  • Bachelor's degree
  • 3-5 years of B2B sales experience; comfortable selling via phone, video, and in person
  • Experience selling a curated portfolio and product programs; delivering trials/in-services
  • Proficiency with CRM software and Microsoft Office (Word, Excel, PowerPoint)
  • Strong communication, customer service, and executive-presence skills
  • Ability to travel 50-75% of the time, including overnight travel
Job Responsibility:
  • OPR Chain Execution – ~80%: Proactively build and strengthen relationships with OPR groups ranging from 10–75 locations
  • Dedicate up to four days per week traveling to meet with both existing and prospective customers, driving growth opportunities and expanding market presence
  • Lead the activation and expansion of awarded OPR programs across assigned chains and key regional accounts by ensuring seamless contract compliance, formulary alignment, and site-level adoption
  • Conduct executive-to-site handoffs from the Corporate OPR Leader; translate enterprise terms into clear site-level actions, timelines, and KPIs
  • Lead field execution of conversions, item substitutions, product trials, and in-services; coordinate assets and training to accelerate adoption
  • Build multi-site account plans; run periodic business reviews (QBRs) with chain stakeholders; identify cross-site expansion opportunities
What we offer:
  • healthcare
  • insurance benefits
  • retirement programs
  • paid time off plans
  • family and parenting leaves
  • wellness programs
  • discount purchase programs
Employment Type:
Full-time

Senior Systems Analyst – Web & Mobile

We are seeking an experienced Senior Systems Analyst to support the design, deli...
Location:
South Africa, Johannesburg
Salary:
Not provided
Myn
Expiration Date:
Until further notice
Requirements:
  • Strong analytical skills
  • Deep experience with digital channels
  • Solid understanding of banking systems, regulatory requirements, and customer-centric digital solutions
  • Experience eliciting, analyzing, documenting, and validating business and functional requirements for web and mobile banking applications
  • Experience translating business needs into functional specifications, user stories, process flows, and system designs
  • Experience acting as a bridge between business stakeholders and technical teams
  • Experience supporting the design and enhancement of web and mobile applications (iOS, Android, responsive web)
  • Experience collaborating with UX/UI teams
  • Experience ensuring solutions are scalable, secure, and aligned with enterprise architecture standards
Job Responsibility:
  • Elicit, analyze, document, and validate business and functional requirements for web and mobile banking applications
  • Translate business needs into functional specifications, user stories, process flows, and system designs
  • Act as a bridge between business stakeholders and technical teams to ensure clarity and alignment
  • Support the design and enhancement of web and mobile applications (iOS, Android, responsive web)
  • Collaborate with UX/UI teams to ensure optimal customer experience and usability
  • Ensure solutions are scalable, secure, and aligned with enterprise architecture standards

Senior Android Developer

Location:
South Africa, Johannesburg
Salary:
Not provided
Myn
Expiration Date:
March 05, 2026
Requirements:
  • Strong experience building production-grade Android applications
  • Solid understanding of mobile application architecture patterns (e.g., MVVM, Clean Architecture)
  • Experience integrating Android applications with REST APIs and request/response models
  • Knowledge of the payments ecosystem, including standards such as ISO 8583 and ISO 20022
  • Understanding of payment security standards including OWASP, PCI DSS, and PA-DSS
  • Strong knowledge of cybersecurity principles and mobile security best practices
  • Proficiency in Kotlin and/or Java; familiarity with C# and cross-platform frameworks (.NET MAUI / Xamarin) is an advantage
  • Experience working with third-party SDKs, libraries, and custom dependencies, including debugging dependency conflicts
  • Familiarity with Android Studio, Gradle, and Android native build tooling
Job Responsibility:
  • Design, develop, and maintain high-quality Android applications that enable modern, secure payment capabilities
  • Architect scalable and maintainable Android solutions within a multi-team, enterprise environment
  • Collaborate closely with product owners, backend engineers, QA, and UX teams to deliver seamless mobile experiences
  • Apply best practices for security, performance, and reliability in mobile applications
  • Stay current with emerging Android technologies, tools, and industry trends
  • Contribute to engineering standards, code reviews, and technical decision-making
  • Drive continuous improvement and help build a world-class mobile engineering team

Associate Due Diligence coordinator-KYC

Wells Fargo is seeking an Associate Due Diligence Coordinator. We are looking for...
Location:
India, Bengaluru
Salary:
Not provided
Wells Fargo
Expiration Date:
February 17, 2026
Requirements:
  • 6+ months of due diligence experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
Job Responsibility:
  • Conduct the Know Your Customer (KYC) compliance process for Wholesale processes in line with the requirements of the USA PATRIOT Act as well as Wells Fargo corporate Anti-Money Laundering and Bank Secrecy Act policy requirements
  • Select relevant samples of data quality exceptions to ensure compliance with the Quality Control framework requirements
  • Participate in and provide compliance support for projects and initiatives with low to high risk to identify, assess and mitigate Bank Secrecy Act and Anti-Money Laundering risk in business activities
  • Analyze risks on escalated, referred, or alerted negative news; communicate negative findings to lines of business and supply guidance on course of action
  • Identify and research the patterns, trends, and anomalies in transactional and customer data to detect, prevent, mitigate, and report suspicious activity related to money laundering and terrorist financing
  • Maintain an audit trail of due diligence performed
  • Analyze potentially suspicious activity, which will require the review of historical activity along with customer information
  • Interact with compliance representatives to assess potential unusual activity
  • Maintain program and procedures, making updates as needed
Employment Type:
Full-time

Senior iOS Developer

Salary:
Not provided
Myn
Expiration Date:
Until further notice
Requirements:
  • Strong experience building production-grade iOS applications
  • Solid understanding of iOS architecture patterns (e.g., MVVM, MVC, Clean Architecture)
  • Proficiency in Swift and/or Objective-C; familiarity with C# and cross-platform frameworks (.NET MAUI / Xamarin) is an advantage
  • Solid understanding of mobile application integrations with REST APIs and request/response models
  • Knowledge of the payments ecosystem, including standards such as ISO 8583 and ISO 20022
  • Knowledge of payment security standards including OWASP, PCI DSS, and PA-DSS
  • Strong understanding of cybersecurity principles and mobile security best practices
  • Experience working with third-party SDKs, libraries, and custom dependencies, including troubleshooting dependency conflicts
  • Familiarity with Xcode, iOS SDKs, CocoaPods, Swift Package Manager, and native build processes
Job Responsibility:
  • Design, develop, and maintain high-quality iOS applications that enable modern, secure payment capabilities
  • Architect scalable, maintainable iOS solutions within a multi-team, enterprise environment
  • Collaborate with product managers, backend engineers, QA, and UX teams to deliver seamless customer experiences
  • Ensure iOS applications meet security, performance, reliability, and compliance standards
  • Stay current with the latest iOS technologies, frameworks, and Apple platform updates
  • Contribute to code reviews, technical design discussions, and engineering best practices
  • Strive for engineering excellence and actively contribute to building a world-class mobile engineering team