We are looking for a dynamic, upbeat software engineer to join our growing team. Your work will focus on building robust, efficient software components that enable high-performance execution of large language models and multimodal models across multi-GPU systems. You’ll collaborate with internal GPU library teams and open-source maintainers to implement features that improve throughput, latency, and scalability. This role emphasizes full-stack development within AI inference systems, with a strong focus on model behavior and framework integration.
Job Responsibilities:
Deep Learning & LLM Framework Optimization for AMD GPUs
Model-Aware Implementation with LLMs and multimodal architectures
Performance-Conscious Coding in multi-GPU environments
Profiling with GPU and framework tools to evaluate the impact of changes
End-to-End Performance Engineering across multi-GPU and multi-node setups
Compiler & Pipeline Acceleration using graph compilers and compiler-driven techniques
Research & Advanced Techniques like speculative decoding and weight-only quantization
Cross-Team & Open-Source Collaboration with internal GPU library teams and open-source maintainers
Software Engineering Excellence for maintainable and production-quality performance optimizations
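Among the techniques named above, speculative decoding is perhaps the least self-explanatory. The toy sketch below illustrates the core idea only: a cheap draft model proposes several tokens ahead, and the expensive target model verifies them in one pass, accepting the agreed prefix plus one bonus token. The lookup-table "models" and all function names are illustrative stand-ins, not anything from this role's actual codebase; a real system runs two LLMs and verifies probabilistically.

```python
# Toy sketch of speculative decoding (illustrative only).
# The "models" are deterministic next-token lookup tables standing in
# for a small draft LLM and a large target LLM.

DRAFT_TABLE = {"the": "cat", "cat": "sat", "sat": "on", "on": "a"}
TARGET_TABLE = {"the": "cat", "cat": "sat", "sat": "on", "on": "the"}

def draft_propose(token, k):
    """Draft model greedily proposes up to k tokens ahead."""
    out = []
    for _ in range(k):
        token = DRAFT_TABLE.get(token)
        if token is None:
            break
        out.append(token)
    return out

def target_verify(token, proposal):
    """Target model accepts the longest prefix it agrees with,
    then emits its own next token (the 'bonus' token)."""
    accepted = []
    for tok in proposal:
        if tok != TARGET_TABLE.get(token):
            break
        accepted.append(tok)
        token = tok
    return accepted, TARGET_TABLE.get(token)

def speculative_decode(prompt_token, k=3, max_len=6):
    out = [prompt_token]
    while len(out) < max_len:
        accepted, bonus = target_verify(out[-1], draft_propose(out[-1], k))
        out.extend(accepted)
        if bonus is None:
            break
        out.append(bonus)
    return out[:max_len]

print(speculative_decode("the"))
```

The speedup in a real system comes from the target model scoring all proposed tokens in a single batched forward pass instead of one pass per token.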
Requirements:
Familiarity with Python
Familiarity with C++ or async programming
Understanding of LLM or multimodal model concepts
Knowledge of transformer architectures, attention mechanisms, vision-language alignment, and inference pipelines
Theoretical grounding in Transformer/Attention/MoE/KV Cache, and quantization (FP8/FP4)
Experience developing in a Linux environment
Experience with profiling and diagnosing compute, memory, and communication bottlenecks across multi-GPU and multi-node environments
Solid Python/C++ coding skills and sound debugging and testing practices
Experience with multimodal models (e.g., Qwen-VL, Qwen-Image-Edit, Wan) or diffusion-based generative models
Familiarity with techniques like quantization, PagedAttention, continuous batching, or speculative decoding
Exposure to GPU computing (ROCm, CUDA) or performance profiling tools (e.g., PyTorch Profiler)
Experience with distributed inference for large-scale models (e.g., Tensor Parallel, Pipeline Parallel)
Bachelor's in Computer Science, Computer Engineering, Electrical Engineering, or a related field
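Of the techniques listed in the requirements, weight-only quantization is the most compact to illustrate. The sketch below shows the basic idea under simple assumptions (symmetric, per-tensor int8 with a single float scale); all names are illustrative, and production systems typically use per-channel or per-group scales and formats like FP8/FP4.

```python
# Minimal sketch of symmetric per-tensor int8 weight-only quantization.
# Weights are stored as 8-bit integers plus one float scale; activations
# stay in floating point and weights are dequantized on the fly.

def quantize_weights(weights):
    """Quantize floats to int8: w_q = round(w / scale), scale = max|w| / 127."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_weights(q, scale):
    """Recover approximate float weights: w ≈ w_q * scale."""
    return [wq * scale for wq in q]

weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_weights(weights)
recovered = dequantize_weights(q, scale)
# Reconstruction error is bounded by scale / 2 per weight.
assert all(abs(w - r) <= scale / 2 + 1e-9 for w, r in zip(weights, recovered))
```

The payoff is a roughly 4x reduction in weight memory versus FP32, which matters because LLM inference is usually bound by memory bandwidth rather than compute.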
Nice to have:
GPU Kernel Development & Optimization using HIP, CUDA, ASM, and tools like CK, CUTLASS, and Triton
Compiler & System-Level Optimization knowledge of LLVM, ROCm, and compiler-driven techniques
Software Engineering Excellence & Community Contribution