AMD is looking for a skilled and motivated software engineer to join the Model Automation and Dashboarding (Framework MAD) team — a group focused on ensuring the reliability, performance, and scalability of AI models running on AMD hardware. As part of this team, you’ll build and maintain tools and infrastructure that automate functional and performance validation of deep learning models across ROCm and GPU platforms. Your contributions will directly impact developer confidence, model portability, and transparent benchmarking for internal teams and the open-source community.
Job Responsibilities
Enable and optimize key AI models (LLM, Vision, MultiModal, etc.) on AMD GPUs
Optimize AI frameworks like PyTorch, TensorFlow, etc., on AMD GPUs in upstream open-source repositories
Collaborate with internal GPU library teams and open-source framework maintainers to analyze, optimize, and integrate code changes upstream
Build and maintain automated functional and performance testing pipelines for AI models across ROCm-supported hardware using scalable tools
Develop tools and automation for continuous benchmarking and regression tracking across hardware generations and ROCm releases
Build and maintain real-time dashboards that report relevant performance, accuracy, and reliability metrics
Support public-facing MAD GitHub repositories and Docker releases, enabling the community to run and validate models on ROCm
Contribute to the design of portable, easy-to-use Python interfaces that support multi-node profiling, distributed workloads, and containerized deployments
Requirements
Undergraduate and/or Master’s Degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field
Strong C/C++/Python programming and software design skills, including debugging, performance analysis, and test design
Experience in test automation, CI/CD, and Linux scripting
Knowledge of GPU computing (HIP, CUDA, OpenCL)
Knowledge of Docker, Kubernetes, or Ansible for testing and deploying AI models and services at scale
Proficiency with version control (GitHub), testing strategies, code reviews, and collaborative software development
Strong written and verbal communication skills with a proactive approach to defining and driving development efforts
Nice to have
Experience with or knowledge of AI models in natural language processing, vision, audio, or recommendation systems
Experience with machine learning frameworks, performance dashboards, or automation platforms
Experience with profiling tools, system monitoring, or regression tracking systems for deep learning models