Software Engineer, Inference – AMD GPU Enablement

OpenAI

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

295000.00 - 555000.00 USD / Year

Job Description:

We’re hiring engineers to scale and optimize OpenAI’s inference infrastructure across emerging GPU platforms. You’ll work across the stack, from low-level kernel performance to high-level distributed execution, and collaborate closely with research, infrastructure, and performance teams to ensure our largest models run smoothly on new hardware. This is a high-impact opportunity to shape OpenAI’s multi-platform inference capabilities from the ground up, with a particular focus on advancing inference performance on AMD accelerators.

Job Responsibility:

  • Own bring-up, correctness, and performance of the OpenAI inference stack on AMD hardware
  • Integrate internal model-serving infrastructure (e.g., vLLM, Triton) into a variety of GPU-backed systems
  • Debug and optimize distributed inference workloads across memory, network, and compute layers
  • Validate correctness, performance, and scalability of model execution on large GPU clusters
  • Collaborate with partner teams to design and optimize high-performance GPU kernels for accelerators using HIP, Triton, or other performance-focused frameworks
  • Collaborate with partner teams to build, integrate and tune collective communication libraries (e.g., RCCL) used to parallelize model execution across many GPUs
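The collective-communication work above centers on primitives like all-reduce, which sums a buffer across every GPU so each one ends up with the full result. As a rough illustration only, here is a pure-Python simulation of the ring schedule that libraries in this family typically use; the ranks, chunk layout, and function names are hypothetical stand-ins, not the RCCL API:

```python
# Pure-Python simulation of a ring all-reduce, the collective schedule
# that libraries like NCCL/RCCL implement on real GPU interconnects.
# No actual RCCL/HIP calls are made; "ranks" are just list indices.

def ring_allreduce(buffers):
    """Sum per-rank buffers in place using a ring schedule.

    buffers[r] is rank r's data, stored as n chunks (one per rank),
    each chunk a list of floats. All ranks end with the full sum.
    """
    n = len(buffers)

    # Phase 1: reduce-scatter. After n-1 steps, rank r holds the fully
    # reduced chunk (r + 1) % n.
    for step in range(n - 1):
        # Snapshot sends first so all ranks exchange pre-step values.
        sends = [((r + 1) % n, (r - step) % n,
                  list(buffers[r][(r - step) % n])) for r in range(n)]
        for dst, c, vals in sends:
            buffers[dst][c] = [a + b for a, b in zip(buffers[dst][c], vals)]

    # Phase 2: all-gather. Each rank forwards its completed chunk around
    # the ring until every rank holds every reduced chunk.
    for step in range(n - 1):
        sends = [((r + 1) % n, (r + 1 - step) % n,
                  list(buffers[r][(r + 1 - step) % n])) for r in range(n)]
        for dst, c, vals in sends:
            buffers[dst][c] = vals
    return buffers

# Two simulated ranks, each with two one-element chunks.
ranks = [[[1.0], [2.0]], [[3.0], [4.0]]]
ring_allreduce(ranks)
# every rank now holds [[4.0], [6.0]]
```

The ring schedule matters because each rank transfers only about 2(n-1)/n of the buffer in total, which is why it is bandwidth-efficient on large messages; tuning the real thing across the communication library, network, and scheduler is the substance of the bullet above.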

Requirements:

  • Experience writing or porting GPU kernels using HIP, CUDA, or Triton
  • Familiarity with communication libraries like NCCL/RCCL
  • Experience working on distributed inference systems
  • Ability to solve end-to-end performance challenges across hardware, system libraries, and orchestration layers
  • Ability to thrive in a small, fast-moving team building new infrastructure from first principles

Nice to have:

  • Contributions to open-source libraries like RCCL, Triton, or vLLM
  • Experience with GPU performance tools (Nsight, rocprof, perf) and memory/comms profiling
  • Prior experience deploying inference on other non-NVIDIA GPU environments
  • Knowledge of model/tensor parallelism, mixed precision, and serving 10B+ parameter models
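The model/tensor parallelism mentioned above is the standard way to serve 10B+ parameter models that do not fit on one GPU: each device holds a slice of a layer's weights and computes a partial result. A minimal sketch in plain Python, assuming a column-wise split of a linear layer (the function names and list-based "tensors" are illustrative, not any real serving API):

```python
# Column-parallel linear layer: each "device" holds a vertical slice of
# the weight matrix and computes its slice of the output independently;
# a real system would then all-gather the slices across GPUs.
# Plain nested lists stand in for device tensors.

def matmul(x, w):
    """(m x k) @ (k x n) on nested lists."""
    return [[sum(x[i][t] * w[t][j] for t in range(len(w)))
             for j in range(len(w[0]))] for i in range(len(x))]

def shard_columns(w, n_devices):
    """Split the weight's columns evenly across devices."""
    cols = len(w[0]) // n_devices
    return [[row[d * cols:(d + 1) * cols] for row in w]
            for d in range(n_devices)]

def column_parallel_forward(x, w, n_devices):
    shards = shard_columns(w, n_devices)
    partials = [matmul(x, w_d) for w_d in shards]  # one matmul per device
    # "All-gather": concatenate each device's output columns back together.
    return [[v for p in partials for v in p[i]] for i in range(len(x))]
```

Splitting row-wise instead yields partial sums that need an all-reduce; pairing a column split with a row split on consecutive linear layers, as in Megatron-style transformer blocks, keeps communication to one collective per pair.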

What we offer:
  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Relocation support for eligible employees
  • Additional taxable fringe benefits, such as charitable donation matching and wellness stipends

Additional Information:

Job Posted:
February 21, 2026

Employment Type:
Full-time

Work Type:
On-site work

Similar Jobs for Software Engineer, Inference – AMD GPU Enablement

Sovereign AI Field Application Engineer

We are seeking a Senior Field Application Engineer (FAE) to join the Centre of E...
Location:
United Kingdom
Salary:
Not provided
AMD
Expiration Date
Until further notice
Requirements:
  • Demonstrable hands-on expertise working with popular AI frameworks and models on GPUs
  • Experience leading large technical programs or opportunities
  • Strong systems background. Understands and can quantify the impact of system architecture on performance
  • Strong, positive, can-do attitude: willing to do what is necessary and to lead others in the wider FAE team by example; available to help colleagues
  • Skilled in independently prioritizing opportunities to deliver results on time
  • Excellent verbal and written communication skills
  • Based in Europe, ideally in the EU zone
  • Open to travel, both domestic and international, approximately 10-20% over a year; anticipate a ramp-up period with increased travel at the start
  • Bachelor's degree in a technical field (Computer Science, Electrical Engineering, Physics, Mathematics) preferred
Job Responsibility:
  • Support winning new AI business in national AI and HPC centres: enable customers to execute their AI workloads on AMD Instinct GPUs, EPYC CPUs, and AI NICs, and support partners in RFP responses by testing requested workloads
  • Owning technical qualification of the customer, partnering with Sales and Business Unit orgs
  • Demonstrate and advise customers and partners through Proof of Concepts, presentations, and training
  • Engineering: execute popular and customer-driven AI inference and training workloads, generate results, and build a characteristic understanding of AI performance on AMD hardware; understand how system and software choices affect performance, and compare performance to the competition
  • Run training and inference performance investigations using common frameworks (PyTorch, TensorFlow, JAX) and benchmarks such as MLPerf and Hugging Face
  • Build a body of documentation for internal and external dissemination: AMD-internal guides, whitepapers, tuning guides, training collateral
  • Provide onsite training
  • Proactively engage across AMD teams (GPU Business Unit, Engineering, Architecture, Platform, Software, and Product Development), providing feedback and leadership from the field on requirements, gathering missing functionality, and working with Engineering to resolve and test
  • Assist in creating Total Cost of Ownership models to aid pricing with bid desk
  • Technically owning and resolving customer and partner issues. Submitting JIRA tickets and driving resolution

Senior x86 Software FAE

We are seeking an experienced and technically skilled Senior x86 Software FAE wi...
Location:
Taiwan, Taipei
Salary:
Not provided
AMD
Expiration Date
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Engineering, Electrical Engineering, Computer Science, or Robotics
  • 8+ years of experience in software engineering, FAE support, or robotics application development
  • Strong understanding of x86 platform architecture, Linux internals, and hardware/software integration
  • Hands-on experience with ROCm, HIP, or CUDA programming, as well as GPU-accelerated computing
  • Familiarity with ROS/ROS 2, perception libraries (OpenCV, PCL), and real-time middleware
  • Proficiency in C/C++, Python, and Linux shell for development and debugging
  • Excellent problem-solving and communication skills with a customer-oriented mindset
Job Responsibility:
  • Provide technical support and solution enablement for customers using AMD x86 platforms and ROCm GPU stack in robotics and AI workloads
  • Collaborate with engineering teams to optimize AI/ML inference pipelines, vision processing, and motion control frameworks on ROCm-enabled hardware
  • Lead software bring-up, benchmarking, and performance analysis for robotics use cases involving GPU, CPU, and heterogeneous compute
  • Work closely with ODM/OEM partners on ROCm deployment, driver tuning, and software validation
  • Act as the technical liaison between customers and internal software/hardware teams to resolve system-level issues
  • Deliver technical training and workshops on ROCm, HIP, and robotics software stack enablement
  • Contribute to solution collateral, whitepapers, and reference designs targeting industrial robotics and AI applications

Director Software Development

At AMD, we are enabling the next generation of AI innovation by leveraging the p...
Location:
China, Shanghai
Salary:
Not provided
AMD
Expiration Date
Until further notice
Requirements:
  • 10+ years in AI/ML software development
  • 5+ years in leadership roles managing AI model enablement or optimization teams
  • Expertise in optimizing real-time AI models for deep learning applications (computer vision, NLP, etc.)
  • Proficiency with AI frameworks (TensorFlow, PyTorch, ONNX Runtime, JAX, Triton) and their optimization for GPU architectures
  • Strong background in optimizing software for AMD GPUs or similar high-performance platforms
  • Familiarity with ROCm is a plus
  • Proven experience with performance optimization, benchmarking, and scaling AI models on GPUs
  • Exceptional ability to collaborate cross-functionally and define long-term strategies for AI/ML innovation
  • Strong verbal and written communication skills, with experience presenting to senior leadership and working with customers and partners
  • Advanced degree (Master’s or PhD) in Computer Science, Electrical Engineering, AI/ML, or related field
Job Responsibility:
  • Lead and develop teams responsible for AI inference model enablement and optimization
  • Direct efforts to optimize AI frameworks for seamless compatibility and performance on AMD GPUs (Instinct, Navi)
  • Oversee benchmarking, performance tuning, and optimization of AI inference models to improve latency, throughput, and efficiency on AMD hardware
  • Partner with hardware, software, and QA teams to ensure tight integration of AI frameworks with ROCm for maximum performance
  • Drive AI model optimization innovations, enhancing the speed, efficiency, and scalability of AI workloads
  • Lead the vision and strategy for optimizing AI inference on AMD GPUs
  • Collaborate with customers and open-source communities to ensure that AMD’s AI solutions meet industry needs, fostering contributions to MIGraphX, vLLM, and other AMD AI inference frameworks
  • Oversee automation frameworks to streamline model integration and performance testing, ensuring scalability across diverse AI workloads

AI Model, Framework, and GPU Engineer

We are looking for an experienced Machine Learning Software Engineer who will be...
Location:
Germany, Munich
Salary:
Not provided
AMD
Expiration Date
Until further notice
Requirements:
  • Strong technical and analytical skills in C/C++/Python AI development in Windows and Linux environments
  • Some knowledge of GPU programming and compilers
  • Capable problem solver
  • Technical leadership: able to define goals and scope and to drive development efforts
  • Good communication skills
  • Enthusiastic about AI technologies
  • Strongly motivated to enable customers with feature-rich, efficient solutions
  • Strong cross-platform software development experience and deep programming skills in C/C++ and Python
  • Excellent problem-solving and effective communication skills
  • Development experience on CONV, GEMM, and/or non-linear operators
Job Responsibility:
  • Develop and deliver innovative AI software solutions to AMD customers and users
  • Enable and optimize the software stack for standard frameworks like ONNX and PyTorch, as well as popular new open-source AI software
  • Bring up new SOTA AI models, and analyze and improve their performance
  • Participate in and drive end-to-end AI software development, from feature scoping, implementation, integration, and verification to customer enablement

Software Engineer II and Senior Software Engineer - Performance

The Artificial Intelligence Performance team at Microsoft develops AI software t...
Location:
United States, Mountain View
Salary:
100600.00 - 199000.00 USD / Year
Microsoft Corporation
Expiration Date
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, or Python OR equivalent experience
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Job Responsibility:
  • Identify and drive improvements to end-to-end inference performance of OpenAI and other state-of-the-art LLMs
  • Measure and benchmark performance on Nvidia/AMD GPUs and first-party Microsoft silicon
  • Optimize and monitor performance of LLMs and build SW tooling to enable insights into performance opportunities ranging from the model level to the systems and silicon level to improve customer experience and reduce the footprint of the computing fleet
  • Enable fast time to market of LLMs/models and their deployments at scale by building SW tools that afford velocity in porting models on new Nvidia and AMD GPUs
  • Design, implement, and test functions or components for our AI/DNN/LLM frameworks and tools
  • Speed up and reduce the complexity of key components/pipelines to improve the performance and/or efficiency of our systems
  • Communicate and collaborate with our partners both internal and external
  • Embody Microsoft's Culture and Values

Application Engineering Lead – OSS

We're seeking a technically strong and community-focused AI Developer Community ...
Location:
India, Bangalore
Salary:
Not provided
AMD
Expiration Date
Until further notice
Requirements:
  • 10+ years of experience in AI/ML, GPU computing, developer relations, technical marketing, or community leadership
  • Strong hands-on experience with AI/ML frameworks such as PyTorch, TensorFlow, or JAX, with understanding of model training and inference workflows
  • Practical knowledge of GPU acceleration concepts and performance optimization fundamentals
  • Demonstrated experience building, scaling, or managing technical developer communities or ecosystems
  • Experience organizing and executing technical events such as workshops, hackathons, seminars, or university programs
  • Open-source mindset with experience contributing to or working closely with community-driven software projects
  • Strong communication and presentation skills, with ability to simplify complex technical concepts for diverse audiences
  • Strong program management skills and ability to execute multi-city or multi-country engagement initiatives
  • Data-driven approach with ability to define KPIs and track ecosystem growth metrics
  • Enthusiasm for community-building, creating content, and helping developers succeed
Job Responsibility:
  • Lead community growth and engagement strategy for AMD’s AI and ROCm ecosystem across India and APAC
  • Drive India- and Asia Pacific-specific engagement programs including developer workshops, technical seminars, webinars, hackathons, meetups, and university initiatives
  • Build and nurture a strong AI developer community focused on AI/ML workloads running on AMD GPUs
  • Engage with open-source projects and their developer communities to encourage contributions aligned with AMD platforms
  • Enable and motivate developers to contribute code, documentation, benchmarks, tutorials, and integrations back to AMD-supported open-source software
  • Collaborate closely with ROCm engineering and product teams to channel community feedback into actionable improvements
  • Represent AMD at regional AI conferences, industry forums, and academic events as a technical and community evangelist
  • Develop high-quality technical content including blogs, tutorials, demos, and reference materials demonstrating best practices for AI development on AMD GPUs
  • Provide technical guidance and first-line enablement to developers adopting AMD GPUs in AI/ML workflows
  • Track community engagement metrics, contribution growth, and ecosystem health to measure program effectiveness

Product Manager - AI Data Center Infrastructure

We are seeking a Product Line M...
Location:
India, Bangalore
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements:
  • 5–10+ years of experience in data center networking, AI infrastructure, or HPC environments
  • Strong hands-on experience with Juniper QFX platforms and JunOS
  • Deep understanding of GPU architectures: NVIDIA (H100/H200, GB200/GB300, NVLink/NVSwitch) and AMD (MI300/MI400, Pollara NICs, Infinity Fabric)
  • Proven expertise in scale-up GPU interconnects and scale-out Ethernet fabrics
  • Strong knowledge of RDMA/RoCEv2, ECN, PFC, and buffer management
  • Familiarity with distributed AI workloads, collective operations (NCCL, RCCL)
  • Hands-on troubleshooting experience with high-speed optics, AEC cables, link training, and NIC firmware
  • Proficiency in automation and scripting (Python, Ansible, Bash, Terraform)
Job Responsibility:
  • AI Data Center & Fabric Architecture: Define product requirements for AI data center network architectures supporting thousands of GPUs
  • Develop requirements for low-latency Ethernet fabrics using Juniper QFX platforms and Apstra-based automation
  • Enable high-bandwidth GPU and NIC interconnects optimized for large-scale distributed training and inference workloads
  • GPU, NIC & Interconnect Strategy: Lead requirements definition for next-generation GPUs, NICs, and interconnect technologies, staying ahead of industry roadmaps
  • Drive alignment with NVIDIA and AMD ecosystems
  • Ensure interoperability across DAC, AEC, ACC, and optical transceivers between switches and NIC endpoints
  • Define scale-up paths using PCIe, NVLink, NVSwitch, ensuring GPU-to-GPU symmetry, consistency, and bandwidth determinism
  • Switching, Routing & Telemetry: Specify and optimize L2/L3 architectures, including EVPN-VXLAN, Class-E IPv4, and AI-optimized buffer tuning
  • Leverage hardware telemetry, streaming sensors, and analytics for proactive performance assurance
  • Drive automation using Python, Ansible, Apstra, Terraform, and related tools to enforce configuration consistency and compliance
What we offer:
  • Health & Wellbeing: comprehensive suite of benefits that supports physical, financial and emotional wellbeing
  • Personal & Professional Development: specific programs catered to helping you reach any career goals
  • Unconditional Inclusion: unconditionally inclusive in the way we work and celebrate individual uniqueness

Principal Software Engineer - Performance

The Artificial Intelligence Cloud Inference team at Microsoft develops AI softwa...
Location:
United States, Mountain View
Salary:
139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screening: Microsoft Cloud Background Check. This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.
Job Responsibility:
  • Identify and drive improvements to end-to-end inference performance of OpenAI and other state-of-the-art LLMs
  • Measure and benchmark performance on Nvidia/AMD GPUs and first-party Microsoft silicon
  • Optimize and monitor performance of LLMs and build SW tooling to enable insights into performance opportunities, ranging from the model level to the systems and silicon level, to help reduce the footprint of the computing fleet and achieve Azure AI capex goals
  • Enable fast time-to-market of LLMs/models and their deployments at scale by building SW tools that afford velocity in porting models to new Nvidia and AMD GPUs and Maia silicon
  • Design, implement, and test functions or components for our AI/DNN/LLM frameworks and tools
  • Speed up and reduce the complexity of key components/pipelines to improve the performance and/or efficiency of our systems
  • Communicate and collaborate with our partners both internal and external
  • Embody Microsoft's Culture and Values