AI/HPC System Performance Engineer

Meta

Location:
United States, Austin

Contract Type:
Not provided

Salary:

219000.00 - 301000.00 USD / Year

Job Description:

Meta's AI Training and Inference Infrastructure is growing exponentially to support an ever-increasing number of AI use cases. This creates a dramatic scaling challenge that our engineers deal with daily. We need to build and evolve the network infrastructure that connects vast numbers of training accelerators, such as GPUs. We also need to ensure that the network runs smoothly and meets the stringent performance and availability requirements of RDMA workloads, which expect a lossless fabric interconnect with minimal latency. To improve the performance of these systems, we constantly look for opportunities across the stack: network fabric, host networking, communication libraries, and scheduling infrastructure.

Job Responsibility:

  • Lead multi-disciplinary teams to develop solutions for large scale training systems. Assess trade-offs of various solutions and make pragmatic decisions
  • Ensure timely milestone delivery with teamwork and close collaboration
  • Own the overall performance of the communication system, including performance benchmarking, monitoring, and troubleshooting production issues
  • Define the technical vision and drive a multi-year roadmap toward the related objectives
  • Work with cross-functional teams and provide guidance on the AI network architecture, including topologies, transport, and congestion control techniques

Requirements:

  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • Experience with developing, evaluating and debugging host networking protocols such as RDMA
  • 10+ years of experience in designing, deploying and operating networks
  • Experience with triaging performance issues in complex scale-out distributed applications

Nice to have:

  • Experience with developing communication libraries, such as Message Passing Interface, NCCL, and UCX
  • Understanding of AI training workloads and the demands they place on networks
  • Understanding of RDMA congestion control mechanisms on InfiniBand and RoCE networks
  • Understanding of the latest artificial intelligence (AI) technologies
  • Experience with machine learning frameworks such as PyTorch and TensorFlow
  • Experience in developing systems software in languages like C++

What we offer:

  • bonus
  • equity
  • benefits

Additional Information:

Job Posted:
January 23, 2026

Similar Jobs for AI/HPC System Performance Engineer

Sr AI/HPC Applications and Performance Engineer

Sr AI/HPC Applications and Performance Engineer role at Hewlett Packard Enterpri...
Location:
United States
Salary:
161500.00 - 370500.00 USD / Year
Hewlett Packard Enterprise
Expiration Date:
Until further notice
Requirements:
  • 15+ years' experience
  • Deep expertise in AI and HPC applications and performance engineering including simulation, modeling and emulation capabilities
  • Expertise in large-scale AI and HPC systems
  • Experience architecting, designing, and developing innovative software system design tools and languages
  • Excellent analytical and problem-solving skills
  • Experience in leading overall architecture of software systems for products and solutions
  • Designing and integrating efficient and scalable software systems running on multiple platform types into overall architecture
  • Evaluating and selecting forms and processes for software systems testing and methodology
  • History of innovation with multiple patents or deployed solutions in the field of software design
  • Excellent written and verbal communication skills
Job Responsibility:
  • Develops organization-wide architectures, strategies, and methodologies for software systems design and development across multiple platforms and organizations
  • Identifies and makes informed recommendations regarding new technologies, innovations, and outsourced development partner relationships
  • Reviews, evaluates, and influences designs and project activities for compliance with development guidelines and standards
  • Provides tangible solutions that improve product quality and mitigate failure risk
  • Contributes to domain expertise, business acumen, and experience to influence decisions of executive business leadership
  • Brings creativity and innovation to the organization
  • Provides guidance and mentoring to less-experienced team members
  • Acts as an internal authority on software systems design
  • Contributes to the external technical community through whitepapers, patents, or other significant innovations
What we offer:
  • Health & Wellbeing benefits
  • Personal & Professional Development programs
  • Unconditional Inclusion environment
  • Comprehensive benefits suite supporting physical, financial and emotional wellbeing
  • Fulltime

Software Engineer - AI/HPC Specialist

We are looking for software engineers to help scale and improve the efficiency o...
Location:
Norway, Oslo
Salary:
Not provided
Meta
Expiration Date:
Until further notice
Requirements:
  • 3+ years of experience developing in C++/C and Python
  • Experience with High Performance Computing/Networking or AI systems applications frameworks
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • Specialized experience in one or more of the following machine learning/deep learning domains: Hardware accelerators, AI Infrastructure, or high performance networking
  • Solid experience in debugging of distributed systems, revision control systems, testing, and CI pipelines
Job Responsibility:
  • Work on collective communications stacks to optimise networking operations, leading to improved AI inference and training model performance
  • Drive implementation of latency and bandwidth critical networking operations, as well as out-of-band signalling
  • Debug custom and third party multi-host, accelerator enabled AI platforms
  • Software development using C++/C and Python
  • Work closely with other teams to deliver impact
  • Develop and improve features and innovations
  • Extend and optimize large scale learning collective operations

AI Research Lab Research Associate

We are currently seeking highly qualified interns to accelerate research towards...
Location:
United States, Milpitas
Salary:
43.27 - 93.15 USD / Hour
Hewlett Packard Enterprise
Expiration Date:
May 26, 2026
Requirements:
  • Pursuing PhD degree (or other degree with significant research and innovation experience) in a relevant discipline (e.g. machine learning, computer science, electrical engineering, math, statistics, etc.)
  • Track record of world-class innovative contributions and ideas in machine learning
  • Experience with innovative solution development, such as developing proofs-of-concept, first-of-a-kind solutions, and/or technology transfer
  • Experience in deep learning research
  • Experience in developing deep learning software with high proficiency in data structures and algorithms
  • Strong programming skills and experience with Python, C/C++, and preferably Java
  • Software development experience in Deep Learning, GPU acceleration, and Model Optimization
  • Experience in Deep Learning and Machine Learning frameworks and models such as TensorFlow and PyTorch
  • Experience in Transformer Neural Network architectures for Generative AI and natural language processing
  • Experience with Agentic AI and Generative AI workflows - desired
Job Responsibility:
  • Conduct research and come up with solutions with a fast turnaround time
  • Build the software and applications for Neural Networks and Machine Learning
  • Work with system programming, Deep Learning frameworks and models, GPU acceleration, Model optimization, real-time streaming data, distributed computing, and deployment
  • Provide thought leadership and technical influence both internally and externally to HPE
  • Collaborate with HPE Labs research teams as well as external partners
  • Work in alignment with HPE's broader innovation community.
What we offer:
  • Health & Wellbeing benefits including physical, financial and emotional wellbeing support
  • Personal and professional development programs
  • Unconditional inclusion and flexibility to manage work and personal needs.
  • Fulltime

Senior Solutions Architect - Data Infrastructure

NetApp is the intelligent data infrastructure company, turning a world of disrup...
Location:
United States
Salary:
205700.00 - 266200.00 USD / Year
NetApp
Expiration Date:
Until further notice
Requirements:
  • 8+ years in solution architecture, systems engineering, or enterprise pre-sales for storage or data infrastructure platforms, with a strong track record of driving technical wins and customer outcomes
  • Executive Presence & Communication. Exceptional presentation, storytelling, and whiteboarding skills, with the ability to lead technical workshops and executive briefings
  • Technical Depth. Expertise across NFS, SMB, iSCSI, FC, NVMe, and S3; experience with virtualization and container platforms (e.g., VMware, Kubernetes); and strong understanding of security, cyber resilience, and AI-adjacent technologies
  • Hybrid Cloud Knowledge. Practical experience with hyperscaler file and object services, data mobility, and replication strategies
  • Solution Design Skills. Comfortable producing reference architectures and integration plans spanning compute, networking, and storage
Job Responsibility:
  • Own Technical Win Plans. Partner with enterprise sales and field leadership on priority opportunities. Lead discovery, shape solution strategy, differentiate competitively, and drive the technical win for large, complex deals
  • Design End-to-End Architectures. Create scalable, resilient, and future-ready architectures across on-prem, cloud-adjacent, and public cloud environments, aligned to customer requirements for performance, availability, security, and total cost of ownership
  • Act as a Portfolio Evangelist. Represent NetApp’s full data infrastructure vision to customers, partners, and internal stakeholders, connecting portfolio capabilities to real-world customer outcomes
  • Build Trusted Executive Relationships. Develop and sustain deep relationships with customer technical and business leaders, partners, and alliances. Drive engagement across executive, architecture, and engineering communities
  • Generate Pipeline with Marketing. Lead webinars, workshops, and Executive Briefing Center sessions; contribute to blogs and video content; present at NetApp INSIGHT; and support regional demand-generation events to open new workloads and buying centers
  • Mentor and Upskill the Field. Coach Solutions Engineers and partner technical teams on solution domains, reference architectures, and repeatable best practices
  • Stay Ahead of the Market. Track industry trends, competitive dynamics, and portfolio evolution to provide timely guidance to customers, sales leadership, and field teams
What we offer:
  • Volunteer time off: 40 hours of paid volunteer time each year
  • Well-being: Employee Assistance Program, fitness, and mental health resources to help employees be their best
  • Time away: Paid time off for vacation and to recharge
  • Health Insurance
  • Life Insurance
  • Retirement or Pension Plans
  • Paid Time Off
  • Fulltime

Business Analyst

TMS is undergoing an exciting period of digital transformation, and we are seeki...
Location:
United Kingdom, Salford, Manchester; London
Salary:
Not provided
DS Smith
Expiration Date:
Until further notice
Requirements:
  • Strong analytical and problem-solving skills with the ability to translate business needs into practical solutions
  • Excellent communication and interpersonal skills, with confidence engaging stakeholders at all levels
  • High attention to detail and a commitment to producing high-quality outputs
  • Adaptable, proactive, and comfortable working in a fast-paced environment with competing priorities
  • Collaborative approach with the ability to influence and negotiate effectively
  • 3+ years’ experience in a Business Analyst role within IT, digital transformation, or similar environments
  • Familiarity with Agile and Waterfall delivery methodologies
  • Experience in FMCG or print management is highly desirable
  • Proficiency with tools such as M365, Visio, and other business analysis or mapping tools
  • Understanding of customer integration technologies (e.g., APIs, CRM platforms)
Job Responsibility:
  • Gather, analyse, and document business and functional requirements in collaboration with stakeholders
  • Assess existing processes, identifying opportunities to streamline workflows and improve efficiency through technology
  • Work with technical teams to design and recommend effective digital solutions that enhance business performance and customer experience
  • Support the delivery of multiple concurrent projects, ensuring alignment with strategic goals
  • Facilitate workshops, meetings, and communication between business teams and IT
  • Produce clear documentation including process maps, business cases, user stories, and specifications
  • Support testing and validation activities to ensure solutions meet business expectations
  • Assist with change management, including the creation of training materials and supporting user adoption
  • Contribute to system and data integration initiatives that improve customer-facing and internal processes
  • Apply sector knowledge (FMCG or print management) to ensure solutions meet industry-specific requirements
What we offer:
  • Competitive salary
  • Qualifying Sick Pay scheme
  • Pension scheme & Life insurance
  • Share Save scheme
  • Income Protection
  • 25 days holiday plus Bank Holidays
  • Employee Assistance Programme
  • Virtual GP, Occupational Health & free Flu vaccine
  • Cycle to Work and shopping discounts
  • Fulltime

Customer Service Representative

We want you to join our team as a Customer Service Representative. If you have t...
Location:
United States of America, El Paso
Salary:
Not provided
Circle K
Expiration Date:
Until further notice
Requirements:
  • Selling products to customers
  • Providing excellent customer care
  • Communication and friendly conversation
  • Performing at a quick pace while having fun
  • Working as part of a team to accomplish daily goals
  • Coming up with great ideas to solve problems
  • Thinking quickly and offering suggestions
  • Ability to stand and/or walk for up to 8 hours
  • Lift and/or carry up to 30 pounds from ground to overhead up to 30 minutes in a shift
  • Occasionally lift and/or carry up to 60 pounds from ground to waist level
Job Responsibility:
  • Greet customers, run the register, cashier, make purchase suggestions and sometimes work with our food program
  • Working around the store (inside and out) in many different areas to help maintain our high standards for store appearance and provide fast and friendly service to our customers
  • Provide regular and predictable onsite attendance
  • Interact with many customers daily, all while working with a fun, energetic team accomplishing daily tasks around the store
What we offer:
  • Medical, Vision, Dental, & Life Insurance/Short & Long Term Disability
  • Flexible Schedules
  • Weekly Pay
  • Weekly Bonus Potential
  • Large, Stable Employer
  • Fast Career Opportunities
  • Work With Fun, Motivated People
  • Task Variety
  • Paid Comprehensive Training
  • 401K With a Competitive Company Match

Senior Python Pyspark Engineer

The Applications Development Senior Programmer Analyst is an intermediate level ...
Location:
India, Pune
Salary:
Not provided
Citi
Expiration Date:
Until further notice
Requirements:
  • 8 - 10 years of relevant experience
  • Experience in systems analysis and programming of software applications
  • Experience in managing and implementing successful projects
  • Working knowledge of consulting/project management techniques/methods
  • Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
  • Programming Languages: Python, PySpark
  • Data Lake Table Format: Apache Iceberg
  • Data Orchestration: Apache Airflow
  • Data Visualization: Tableau
  • Big Data Processing: Apache Spark
Job Responsibility:
  • Conduct tasks related to feasibility studies, time and cost estimates, IT planning, risk technology, applications development, model development, and establish and implement new or revised applications systems and programs to meet specific business needs or user areas
  • Monitor and control all phases of development process and analysis, design, construction, testing, and implementation as well as provide user and operational support on applications to business users
  • Utilize in-depth specialty knowledge of applications development to analyze complex problems/issues, provide evaluation of business process, system process, and industry standards, and make evaluative judgement
  • Recommend and develop security measures in post implementation analysis of business usage to ensure successful system design and functionality
  • Consult with users/clients and other technology groups on issues, recommend advanced programming solutions, and install and assist customer exposure systems
  • Ensure essential procedures are followed and help define operating standards and processes
  • Serve as advisor or coach to new or lower level analysts
  • Operate with a limited level of direct supervision
  • Exercise independence of judgement and autonomy
  • Act as SME to senior stakeholders and/or other team members
  • Fulltime

Community Marketing Manager - Surface & Edge

Microsoft’s Consumer Marketing Organization’s Social Marketing team is revolutio...
Location:
United States, Redmond
Salary:
85100.00 - 169800.00 USD / Year
Microsoft Corporation
Expiration Date:
Until further notice
Requirements:
  • Bachelor's Degree in Business, Marketing, Communications, Economics, Public Relations, or related field AND 1+ year(s) community management, social media or related work experience OR equivalent experience
  • Bachelor's Degree in Business, Marketing, Communications, Economics, Public Relations, or related field AND 3+ years community management, social media, or related work experience OR equivalent experience
  • Experience monitoring and analyzing large volumes of community conversation across platforms
  • Experience translating insights into clear recommendations for product and marketing teams
  • Experience in consumer tech, gaming, entertainment, or digital first brands
  • Experience with social listening tools and community analytics methodologies
  • Experience navigating nuanced sentiment, crisis scenarios, and high-velocity discourse
Job Responsibility:
  • Monitor conversations across Reddit, Discord, social platforms, and key press outlets to identify sentiment shifts, behavioral trends, emerging opportunities, and risks
  • Convert community insights into structured recommendations for product, community engagement, and marketing strategies
  • Act as the internal champion for audience needs, ensuring community findings shape decision making across teams
  • Identify and escalate crisis related conversation trends with context, recommendations, and mitigation paths
  • Ensure all content scheduled in the editorial calendar is published accurately and on time across owned channels
  • Partner closely with creative, social strategy, and content production teams to develop messaging that resonates with the community and reflects real audience motivators
  • Collaborate with cross functional partners to infuse community insights into campaign development, messaging frameworks, and narrative direction
  • Own and operate dedicated social and/or community presences as a trusted point of direct-to-community conversation
  • Build relationships with core audience segments, superusers, advocates, and topic resources in relevant ecosystems
  • Facilitate bidirectional dialogue by sharing product updates, gathering feedback, and deepening fan trust and loyalty
  • Fulltime