Software Engineer, Inference – AMD GPU Enablement

OpenAI

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:
295000.00 - 555000.00 USD / Year

Job Description:

We’re hiring engineers to scale and optimize OpenAI’s inference infrastructure across emerging GPU platforms. You’ll work across the stack - from low-level kernel performance to high-level distributed execution - and collaborate closely with research, infra, and performance teams to ensure our largest models run smoothly on new hardware. This is a high-impact opportunity to shape OpenAI’s multi-platform inference capabilities from the ground up with a particular focus on advancing inference performance on AMD accelerators.

Job Responsibility:

  • Own bring-up, correctness and performance of the OpenAI inference stack on AMD hardware
  • Integrate internal model-serving infrastructure (e.g., vLLM, Triton) into a variety of GPU-backed systems
  • Debug and optimize distributed inference workloads across memory, network, and compute layers
  • Validate correctness, performance, and scalability of model execution on large GPU clusters
  • Collaborate with partner teams to design and optimize high-performance GPU kernels for accelerators using HIP, Triton, or other performance-focused frameworks
  • Collaborate with partner teams to build, integrate and tune collective communication libraries (e.g., RCCL) used to parallelize model execution across many GPUs

Requirements:

  • Experience writing or porting GPU kernels using HIP, CUDA, or Triton
  • Familiarity with communication libraries like NCCL/RCCL
  • Experience working on distributed inference systems
  • Ability to solve end-to-end performance challenges across hardware, system libraries, and orchestration layers
  • Ability to thrive in a small, fast-moving team building new infrastructure from first principles

Nice to have:

  • Contributions to open-source libraries like RCCL, Triton, or vLLM
  • Experience with GPU performance tools (Nsight, rocprof, perf) and memory/comms profiling
  • Prior experience deploying inference on other non-NVIDIA GPU environments
  • Knowledge of model/tensor parallelism, mixed precision, and serving 10B+ parameter models

What we offer:

  • Medical, dental, and vision insurance for you and your family, with employer contributions to Health Savings Accounts
  • Pre-tax accounts for Health FSA, Dependent Care FSA, and commuter expenses (parking and transit)
  • 401(k) retirement plan with employer match
  • Paid parental leave (up to 24 weeks for birth parents and 20 weeks for non-birthing parents), plus paid medical and caregiver leave (up to 8 weeks)
  • Paid time off: flexible PTO for exempt employees and up to 15 days annually for non-exempt employees
  • 13+ paid company holidays, and multiple paid coordinated company office closures throughout the year for focus and recharge, plus paid sick or safe time (1 hour per 30 hours worked, or more, as required by applicable state or local law)
  • Mental health and wellness support
  • Employer-paid basic life and disability coverage
  • Annual learning and development stipend to fuel your professional growth
  • Daily meals in our offices, and meal delivery credits as eligible
  • Relocation support for eligible employees
  • Additional taxable fringe benefits, such as charitable donation matching and wellness stipends

Additional Information:

Job Posted:
February 21, 2026

Employment Type:
Full-time
Work Type:
On-site work

Similar Jobs for Software Engineer, Inference – AMD GPU Enablement

AI Systems Engineer – AI Model (Training & Inference)

The AMD AI Group is looking for a Senior Software Development Engineer to own th...
Location:
Canada, Markham
Salary:
106400.00 - 159600.00 CAD / Year
AMD
Expiration Date
Until further notice
Requirements:
  • Industry experience shipping production AI/ML infrastructure, with hands-on work spanning both training and inference.
  • Bachelor’s, Master’s, or Ph.D. degree in Computer/Software Engineering, Computer Science, or a related technical discipline
Job Responsibility:
  • Enable and optimize large-scale model training (LLMs, VLMs, MoE architectures) on AMD Instinct GPU clusters, ensuring correctness, reproducibility, and competitive throughput.
  • Build and maintain training infrastructure: job orchestration, distributed checkpointing, data loading pipelines, and storage optimization for multi-thousand GPU clusters on Kubernetes.
  • Debug and resolve training-specific issues including gradient norm explosions, non-deterministic behavior across GPU generations, and compute-communication overlap in distributed training (FSDP, DeepSpeed, Megatron-LM).
  • Optimize RCCL collective communication patterns for training workloads, including all-reduce, all-gather, and reduce-scatter across multi-node topologies.
  • Develop monitoring, alerting, and compliance infrastructure to ensure training cluster health, data security, and SLA adherence at scale.
  • Design and build end-to-end validation and testing infrastructure using proxy workloads, synthetic benchmarks, and configurable workload generators to systematically validate platform readiness across AMD Instinct GPU generations.
  • Write and optimize high-performance GPU kernels (GEMM, attention, quantized matmul, GPTQ/AWQ) in HIP, Triton, and MLIR targeting AMD Instinct architectures, with demonstrated ability to outperform open-source baselines.
  • Drive end-to-end inference enablement on new AMD GPU silicon - be among the first to get frontier models running on each new Instinct generation, creating reproducible guides and reference implementations.
  • Optimize inference serving frameworks (vLLM, SGLang, TorchServe) for AMD GPUs: batching strategies, KV-cache management, speculative decoding, and continuous batching for production throughput/latency targets.
  • Develop novel approaches to inference acceleration, including bio-inspired algorithms, SLM-assisted batching, and custom scheduling strategies that exploit AMD hardware characteristics.
Employment Type:
Full-time

Sovereign AI Field Application Engineer

We are seeking a Senior Field Application Engineer (FAE) to join the Centre of E...
Location:
United Kingdom
Salary:
Not provided
AMD
Expiration Date
Until further notice
Requirements:
  • Demonstrable hands-on expertise working with popular AI frameworks and models on GPUs
  • Experience leading large technical programs or opportunities
  • Strong systems background. Understands and can quantify the impact of system architecture on performance
  • Strong, positive can-do attitude; willing to do what is necessary and to lead others in the wider FAE team by example. Available to help colleagues
  • Skilled in independently prioritizing opportunities to deliver results on time
  • Excellent verbal and written communication skills
  • Based in Europe, ideally in the EU zone
  • Open to travel both domestic and international, approximately 10-20% over a year. Anticipate a ramp period with increased travel at the start
  • Bachelor's degree in a technical field (Computer Science, Electrical Engineering, Physics, Mathematics) preferred
Job Responsibility:
  • Support winning new AI business in national AI and HPC centres. Enabling customers to execute their AI workloads on AMD Instinct GPUs, EPYC CPUs, and AI NICs. Supporting partners in RFP responses by testing requested workloads
  • Owning technical qualification of the customer, partnering with Sales and Business Unit orgs
  • Demonstrate and advise customers and partners through Proof of Concepts, presentations, and training
  • Engineering: execute popular and customer-driven AI inference and training workloads, generate results and create a characteristic understanding of AI performance on AMD hardware. Understand how system and software choices affect performance. Compare performance to our competition
  • Run training and inference performance investigations using common frameworks (PyTorch, TensorFlow, JAX) and tools such as MLPerf and Hugging Face
  • Build a body of documentation for internal and external dissemination: AMD-internal guides, whitepapers, tuning guides, training collateral
  • Provide onsite training
  • Proactive engagement across AMD teams: GPU Business Unit, Engineering, Architecture, Platform, Software, and Product Development teams providing feedback and leadership from the field on requirements. Gathering missing functionality and working with Engineering to resolve and test
  • Assist in creating Total Cost of Ownership models to aid pricing with bid desk
  • Technically owning and resolving customer and partner issues. Submitting JIRA tickets and driving resolution
Employment Type:
Full-time

Senior x86 Software FAE

We are seeking an experienced and technically skilled Senior x86 Software FAE wi...
Location:
Taiwan, Taipei
Salary:
Not provided
AMD
Expiration Date
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Engineering, Electrical Engineering, Computer Science, or Robotics
  • 8+ years of experience in software engineering, FAE support, or robotics application development
  • Strong understanding of x86 platform architecture, Linux internals, and hardware/software integration
  • Hands-on experience with ROCm, HIP, or CUDA programming, as well as GPU-accelerated computing
  • Familiarity with ROS/ROS 2, perception libraries (OpenCV, PCL), and real-time middleware
  • Proficiency in C/C++, Python, and Linux shell for development and debugging
  • Excellent problem-solving and communication skills with a customer-oriented mindset
Job Responsibility:
  • Provide technical support and solution enablement for customers using AMD x86 platforms and ROCm GPU stack in robotics and AI workloads
  • Collaborate with engineering teams to optimize AI/ML inference pipelines, vision processing, and motion control frameworks on ROCm-enabled hardware
  • Lead software bring-up, benchmarking, and performance analysis for robotics use cases involving GPU, CPU, and heterogeneous compute
  • Work closely with ODM/OEM partners on ROCm deployment, driver tuning, and software validation
  • Act as the technical liaison between customers and internal software/hardware teams to resolve system-level issues
  • Deliver technical training and workshops on ROCm, HIP, and robotics software stack enablement
  • Contribute to solution collateral, whitepapers, and reference designs targeting industrial robotics and AI applications

Director Software Development

At AMD, we are enabling the next generation of AI innovation by leveraging the p...
Location:
China, Shanghai
Salary:
Not provided
AMD
Expiration Date
Until further notice
Requirements:
  • 10+ years in AI/ML software development
  • 5+ years in leadership roles managing AI model enablement or optimization teams
  • Expertise in optimizing real-time AI models for deep learning applications (computer vision, NLP, etc.)
  • Proficiency with AI frameworks (TensorFlow, PyTorch, ONNX Runtime, JAX, Triton) and their optimization for GPU architectures
  • Strong background in optimizing software for AMD GPUs or similar high-performance platforms
  • Familiarity with ROCm is a plus
  • Proven experience with performance optimization, benchmarking, and scaling AI models on GPUs
  • Exceptional ability to collaborate cross-functionally and define long-term strategies for AI/ML innovation
  • Strong verbal and written communication skills, with experience presenting to senior leadership and working with customers and partners
  • Advanced degree (Master’s or PhD) in Computer Science, Electrical Engineering, AI/ML, or related field
Job Responsibility:
  • Lead and develop teams responsible for AI inference model enablement and optimization
  • Direct efforts to optimize AI frameworks for seamless compatibility and performance on AMD GPUs (Instinct, Navi)
  • Oversee benchmarking, performance tuning, and optimization of AI inference models to improve latency, throughput, and efficiency on AMD hardware
  • Partner with hardware, software, and QA teams to ensure tight integration of AI frameworks with ROCm for maximum performance
  • Drive AI model optimization innovations, enhancing the speed, efficiency, and scalability of AI workloads
  • Lead the vision and strategy for optimizing AI inference on AMD GPUs
  • Collaborate with customers and open-source communities to ensure that AMD’s AI solutions meet industry needs, fostering contributions to MIGraphX, vLLM, and other AMD AI Framework Inference teams
  • Oversee automation frameworks to streamline model integration and performance testing, ensuring scalability across diverse AI workloads

AI Model, Framework, and GPU Engineer

We are looking for an experienced Machine Learning Software Engineer who will be...
Location:
Germany, Munich
Salary:
Not provided
AMD
Expiration Date
Until further notice
Requirements:
  • Strong technical and analytical skills in C/C++/Python AI development in Windows and Linux environments
  • Some knowledge of GPU programming and compilers
  • Capable problem solver
  • Technical leader to define goals and scope and drive development effort
  • Good communication skills
  • Enthusiastic about AI technologies
  • Strongly motivated to enable customers with the best feature-rich, efficient solutions
  • Strong cross-platform software development experience and deep programming skills in C/C++ and Python
  • Excellent problem-solving and effective communication skills
  • Development experience on CONV, GEMM, and/or non-linear operators
Job Responsibility:
  • Develop and deliver innovative AI software solutions to AMD customers and users
  • Enable and optimize software stack for standard frameworks like ONNX and PyTorch, as well as new popular Open-Source AI software
  • Bring up new SOTA AI models, analyze and improve their performance
  • Participate in and drive end-to-end AI software development, from feature scoping, implementation, integration, and verification to customer enablement

Solution Architect – Physical AI

AMD’s Adaptive Embedded Compute Group (AECG) builds products that combine powerf...
Location:
United Kingdom, Belfast
Salary:
Not provided
AMD
Expiration Date
Until further notice
Requirements:
  • Experienced Electrical, Computer, or Systems Engineer with deep experience architecting AI‑enabled systems for robotics, automation, autonomous machines, or other safety‑critical / mission‑critical systems
  • Hands‑on experience with AI inference hardware (GPU and NPU) and associated software stacks
  • Experience with robotics application frameworks and system integration
  • Demonstrated ability to break down large, complex problems into manageable deliverables and to manage and prioritize requirements across multiple stakeholders
  • BS, MS, or PhD in Computer Science, Computer Engineering, or Electrical Engineering
Job Responsibility:
  • Partner with silicon planning and platform architecture teams to help define silicon features and software stacks for next‑generation Physical AI systems
  • Collaborate with market segment architects and business leaders to create customer‑focused Physical AI solutions addressing complex requirements across Aerospace, Automotive, Medical, Robotics, Industrial, and Vision markets
  • Architect heterogeneous AI systems (CPU + GPU + NPU, FPGA)
  • Work closely with software engineering and product planning teams to define all aspects of the Physical AI software stack, including ROCm and Ryzen™ AI software support for iGPUs and NPUs, AMD’s Virtualized Automotive Stack, Robot Operating System (ROS), Multimedia analytics pipelines, AI models, and Vision‑Language Models (VLMs) and Large Language Models (LLMs)
  • Evaluate and communicate system‑level tradeoffs and architectural decisions required to deploy AI in real‑time, deterministic, and safety‑constrained environments
What we offer:
  • Benefits offered are described: AMD benefits at a glance
Employment Type:
Full-time

AI Application Engineer

WHAT YOU DO AT AMD CHANGES EVERYTHING At AMD, our mission is to build great prod...
Location:
China, Shanghai; Shenzhen; Beijing
Salary:
Not provided
AMD
Expiration Date
Until further notice
Requirements:
  • Developer enablement with leading open-source communities and AI frameworks, including PyTorch, vLLM, SGLang, Unsloth, PaddlePaddle, Mooncake, TileLang, LangChain, VERL, and LLaMA-Factory, across both training and inference workflows
  • Strong experience with LLMs and Generative AI, including transformer architectures, attention mechanisms, MoE models, and end-to-end AI pipelines
  • Solid understanding of GPU-accelerated computing; familiarity with the ROCm AI software stack is strongly preferred
  • Proven ability to collaborate effectively with open-source software communities to drive developer enablement and ecosystem activities
  • Excellent communication and presentation skills, with the ability to clearly articulate architectural proposals, technical trade-offs, and value propositions to diverse stakeholders
  • Bachelor's degree required
  • Master's degree preferred
Job Responsibility:
  • Capture and prioritize developer and customer requirements to shape AMD's AI software feature planning and solutions roadmap
  • Lead and contribute to collaboration with AI open-source projects, strengthening the developer community and broader ecosystem
  • Partner with internal AI software engineering teams to drive developer enablement through performance optimization, OSS contributions, Discord/GitHub support, AI Academy initiatives, solutions, reference designs, blogs, tutorials, and user guides
  • Work closely with internal AI software teams to ensure the success of AI developers, communities, and customer proof-of-concepts (PoCs)
  • Provide actionable feedback and requirements for AI software across cloud, client, and edge deployments
Employment Type:
Full-time

Software Engineer II and Senior Software Engineer - Performance

The Artificial Intelligence Performance team at Microsoft develops AI software t...
Location:
United States, Mountain View
Salary:
100600.00 - 199000.00 USD / Year
Microsoft Corporation
Expiration Date
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, or Python OR equivalent experience
  • Ability to meet Microsoft, customer and/or government security screening requirements is required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Job Responsibility:
  • Identify and drive improvements to end-to-end inference performance of OpenAI and other state-of-the-art LLMs
  • Measure and benchmark performance on NVIDIA/AMD GPUs and first-party Microsoft silicon
  • Optimize and monitor performance of LLMs and build SW tooling to enable insights into performance opportunities ranging from the model level to the systems and silicon level to improve customer experience and reduce the footprint of the computing fleet
  • Enable fast time to market of LLMs/models and their deployments at scale by building SW tools that afford velocity in porting models on new Nvidia and AMD GPUs
  • Design, implement, and test functions or components for our AI/DNN/LLM frameworks and tools
  • Speed up or reduce the complexity of key components/pipelines to improve the performance and/or efficiency of our systems
  • Communicate and collaborate with our partners both internal and external
  • Embody Microsoft's Culture and Values
Employment Type:
Full-time