Senior AI Models GPU Deployment Software Engineer

AMD

Location:
India, Bangalore

Contract Type:
Not provided

Salary:
Not provided

Job Description:

Join AMD and help bring cutting-edge AI models to life on AMD GPUs! We’re looking for someone excited about AI and high-performance computing. In this role, you’ll work with the latest hardware and software technologies to make AI models run faster and more efficiently. You’ll be part of a collaborative team that values learning and innovation.

Job Responsibility:

  • Help run and improve AI models (like Chatbots, Vision, and MultiModal systems) on AMD GPUs
  • Work with popular AI tools like PyTorch and TensorFlow to make them faster on AMD GPUs
  • Collaborate with open-source communities to share improvements
  • Apply good coding practices to build reliable and efficient software

Requirements:

  • Basic understanding of GPU computing (HIP, CUDA, or OpenCL is a plus)
  • Interest in computer architecture and how hardware works
  • Familiarity with AI concepts (Natural Language Processing, Vision, Audio, Recommendations)
  • Programming skills in C++, Python, or similar languages
  • Ability to debug and test your code
  • Bachelor’s degree in Computer Science, Computer Engineering, or a related field

Additional Information:

Job Posted:
April 15, 2026

Similar Jobs for Senior AI Models GPU Deployment Software Engineer

Senior Manager, Performance AI/ML Network Deployment Engineering

The Senior Manager, DC GPU Advanced Forward Deployment and Systems Engineering i...
Location:
United States, Santa Clara
Salary:
210400.00 - 315600.00 USD / Year
AMD
Expiration Date
Until further notice
Requirements:
  • Expertise in networking and performance optimization for large-scale AI/ML networks, including network, compute, storage cluster design, modelling, analytics, performance tuning, convergence, scalability improvements
  • Candidates with solid, hands-on expertise in at least one of three domains (compute, network, storage) are preferred
  • Experience in working with large customers such as Cloud Service Providers and global enterprise customers
  • Proven leadership in engaging customers with diverse technical disciplines in avenues such as Proof of Concept, Competitive evaluations, Early Field Trials etc
  • Direct experience working with large customers, with the ability to operate with a sense of urgency, own problems, and resolve them
  • Demonstrated leadership in network architecture, hands on experience in RoCEv2 Design, VXLAN-EVPN, BGP, and Lossless Fabrics
  • Proven ability to influence design and technology roadmaps, leveraging a deep understanding of datacenter products and market trends
  • Extensive hands-on Network deployment expertise and proven track record of delivering large projects on time. Cisco, Juniper or Arista experience is preferred
  • Direct, co-development/deployment experience in working with strategic customers/partners in bringing solutions to market
  • Excellent communication skills with audiences ranging from engineers to mid-management to C-level
Job Responsibility:
  • Collaborate with strategic customers on scalable designs spanning compute, networking, and storage environments, and work with industry partners and internal teams to accelerate the deployment and adoption of various AI/ML models
  • Engage in system-level triage and at-scale debug of complex issues across hardware, firmware, and software, ensuring rapid resolution and system reliability
  • Drive the ramp of Instinct-based large scale AI datacenter infrastructure based on NPI base platform hardware with ROCm, scaling up to pod and cluster level, leveraging the best in network architecture for AI/ML workloads
  • Enhance tools and methodologies for large-scale deployments to meet customer uptime goals and exceed performance expectations
  • Engage with clients to deeply understand their technical needs, ensuring their satisfaction with tailored solutions that leverage your past experience in strategic customer engagements and architectural wins
  • Provide domain specific knowledge to other groups at AMD, share the lessons learnt to drive continuous improvement
  • Engage with AMD product groups to drive resolution of application and customer issues
  • Develop and present training materials to internal audiences, at customer venues, and at industry conferences

Senior Software Engineer, AI Platform and Enablement

We're building a next-generation AI-powered platform and web application for cre...
Location:
United States, San Francisco
Salary:
180000.00 - 286000.00 USD / Year
Descript
Expiration Date
Until further notice
Requirements:
  • Experience in deploying and managing AI models in production
  • Experience with tools for large-volume data pipelines, such as Spark, Flume, Dask, etc.
  • Familiarity with cloud platforms (AWS, Google Cloud, Azure) and container technologies (Docker, Kubernetes)
  • Knowledge of DevOps and MLOps best practices
  • Strong problem-solving abilities and excellent communication skills
Job Responsibility:
  • Build, maintain, and standardize third-party model integrations, including consulting for other engineering teams with AI model integration needs
  • Design, implement, and maintain our AI infrastructure supporting our machine learning life cycle, including data ingestion pipelines, training developer experience and infrastructure, evaluation frameworks, and deployments / GPU infrastructure
  • Collaborate with Product Managers, Research Engineers, and AI Researchers to understand their infrastructure needs and ensure our AI systems are robust, scalable, and efficient
  • Optimize and scale our models and algorithms for efficient inference
  • Deploy, monitor, and manage AI models in production
What we offer:
  • Generous healthcare package
  • 401k matching program
  • Catered lunches
  • Flexible vacation time
  • Fulltime

Senior System Development Engineer – AI Technologies

Our customers’ system requirements are usually highly complex. Bringing together...
Location:
United States, Austin
Salary:
123000.00 - 170000.00 USD / Year
Dell
Expiration Date
Until further notice
Requirements:
  • Bachelor’s or Master’s degree in Computer Engineering, Computer Science, Electrical Engineering, or related field
  • 5+ years of experience in system engineering, platform development, or hardware–software validation
  • Strong understanding of x86 system architecture, CPU/GPU/accelerator internals, memory systems, and I/O subsystems
Job Responsibility:
  • Lead bring‑up, configuration, and validation of system platforms supporting AI workloads (servers, GPU racks, accelerators, networking fabrics)
  • Work with BIOS/UEFI, BMC, firmware, drivers, and kernel subsystems to ensure system readiness for large‑scale AI deployments
  • Perform hardware–software co-validation of CPUs, GPUs, DPUs, NICs, accelerators, and memory subsystems under AI‑heavy workloads
  • Validate PCIe fabric behavior, NUMA topology, and data‑path efficiency for model training and inference
  • Diagnose complex issues across BIOS, firmware, OS, driver stack, container runtime, orchestration layer, and AI frameworks
  • Analyze system logs, kernel traces, hardware event telemetry, GPU health signals, and fabric diagnostics
  • Conduct root‑cause analysis of performance bottlenecks, training failures, model divergence, and hardware stability issues
  • Collaborate with silicon, firmware, OS, and AI software teams to resolve issues rapidly
  • Deploy and manage AI clusters: GPU servers, accelerators, high‑speed networking (InfiniBand, RoCE), and storage systems
  • Validate cluster readiness for distributed training, including bandwidth, latency, topology checks, and gradient‑sync performance
What we offer:
  • Comprehensive Healthcare Programs
  • Award Winning Financial Wellness Tools and Resources
  • Generous Leave of Absence for New Parents and Caregivers
  • Industry Leading Wellness Platform
  • Employee Assistance Program
  • Fulltime

Senior Software Engineer- AI

Are you looking for an opportunity to work with the latest Azure offerings and p...
Location:
India, Bangalore
Salary:
Not provided
Microsoft Corporation
Expiration Date
Until further notice
Requirements:
  • 7+ years of experience in Software Development
  • Strong programming expertise in one or more languages such as Python, Go, Java, or C#, with experience designing production-grade services and APIs
  • Experience building AI-powered applications, including integrating LLMs, implementing agent or Copilot workflows, and orchestrating multi-step AI interactions
  • Hands-on experience with LLM application frameworks and orchestration tools such as Semantic Kernel, LangChain, or similar agent frameworks
  • Familiarity with retrieval-augmented generation (RAG) architectures, vector databases, embeddings, and semantic search systems
  • Experience evaluating and improving model performance through prompt design, evaluation frameworks, fine-tuning, or feedback loops
  • Solid understanding of distributed systems concepts including scalability, reliability, observability, caching, and asynchronous processing
  • Experience deploying and operating AI workloads in cloud environments (preferably Azure), including containerized services and GPU-enabled infrastructure
  • Understanding of Responsible AI practices, including model governance, safety, privacy, and evaluation of AI behaviour in production systems
  • Ability to work across product, research, and engineering teams to translate product scenarios into scalable AI system architectures
Job Responsibility:
  • Design, build, and operate scalable AI systems that power intelligent product experiences, including Copilot and agent-driven workflows
  • Architect and implement backend services that support multi-step AI interactions, including orchestration pipelines, context management, memory/state persistence, and tool execution
  • Integrate large language models (LLMs), APIs, and internal services to enable context-aware, human-in-the-loop experiences across customer scenarios
  • Build and maintain data and inference pipelines that support model training, fine-tuning, evaluation, and real-time inference across diverse data sources
  • Evaluate, benchmark, and tune AI/ML models (LLMs and traditional models) to meet product requirements for accuracy, latency, reliability, and safety
  • Implement robust retrieval, grounding, and knowledge integration mechanisms (e.g., RAG systems, semantic indexing, vector search) to power intelligent applications
  • Collaborate with product managers, software engineers, and researchers to translate product vision into production-ready AI capabilities and measurable outcomes
  • Ensure reliability, observability, and governance of AI systems, including monitoring model performance, data quality, and responsible AI practices
  • Build reusable platforms, APIs, and tools that enable teams to rapidly develop AI-powered features and self-service intelligent applications
  • Fulltime

Senior AI Software Development Engineer

We are currently seeking a senior, experienced AI Software Engineer to join our ...
Location:
Romania, Iasi; Brasov; Bucharest
Salary:
Not provided
AMD
Expiration Date
Until further notice
Requirements:
  • Demonstrated experience delivering complex AI or software systems and influencing technical direction within a team
  • Strong understanding of AI/ML concepts and techniques, including deep learning, supervised and unsupervised learning, reinforcement learning, and probabilistic graphical models
  • Familiarity with popular ML frameworks and libraries, such as TensorFlow, PyTorch, Keras, and Scikit-learn
  • Proficient in programming languages such as Python, C++, and Java, with a strong focus on maintainable, high-quality production code
  • Familiarity with AMD's hardware (GPU, CPU, and APU) and software (ROCm, OpenCL, HIP) platforms is a plus, but not required
  • Strong analytical, problem-solving, and critical-thinking skills, with the ability to balance hands-on development with broader technical ownership
  • Excellent written and verbal communication skills, with the ability to effectively communicate complex concepts to a diverse audience
  • Bachelor’s or Master’s degree in Computer Science, Computer/Software Engineering or related technical discipline
Job Responsibility:
  • Serve as a senior technical contributor, helping define system architecture, development standards, and best practices
  • Provide mentorship and technical guidance to other engineers through design discussions, code reviews, and knowledge sharing
  • Assist in the development of artificial intelligence models, algorithms, and systems tailored to specific project goals and requirements
  • Collaborate effectively with cross-functional teams, including product managers, researchers, hardware engineers, and software developers to support the development of comprehensive AI solutions
  • Learn and adapt to new techniques and methodologies to enhance product performance and develop new features
  • Optimize machine learning models for efficient deployment on AMD hardware and software platforms
  • Contribute to the process of monitoring the performance of deployed models, maintenance and updates, and troubleshooting any related issues
  • Stay current on the latest advancements in the fields of AI and machine learning, collaborating closely with colleagues to foster a culture of innovation
  • Fulltime

Senior AI Models MAD - Model Automation and Dashboarding Engineer

AMD is looking for a skilled and motivated software engineer to join the Model A...
Location:
India, Hyderabad
Salary:
Not provided
AMD
Expiration Date
Until further notice
Requirements:
  • Undergraduate and/or Master’s Degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field
  • Strong C/C++/Python programming and software design skills, including debugging, performance analysis, and test design
  • Experience in test automation, CI/CD, and Linux scripting
  • Knowledge of GPU computing (HIP, CUDA, OpenCL)
  • Knowledge of Docker, Kubernetes, or Ansible for testing and deploying AI models and services at scale
  • Proficiency with version control (GitHub), testing strategies, code reviews, and collaborative software development
  • Strong written and verbal communication skills with a proactive approach to defining and driving development efforts
Job Responsibility:
  • Enable and optimize key AI models (LLM, Vision, MultiModal, etc.) on AMD GPUs
  • Optimize AI frameworks like PyTorch, TensorFlow, etc., on AMD GPUs in upstream open-source repositories
  • Collaborate with internal GPU library teams and open-source framework maintainers to analyze, optimize, and integrate code changes upstream
  • Build and maintain automated functional and performance testing pipelines for AI models across ROCm-supported hardware using scalable tools
  • Develop tools and automation for continuous benchmarking and regression tracking across hardware generations and ROCm releases
  • Build and maintain real-time dashboards that report relevant performance, accuracy, and reliability metrics
  • Support public-facing MAD GitHub repositories and Docker releases, enabling the community to run and validate models on ROCm
  • Contribute to the design of portable, easy-to-use Python interfaces that support multi-node profiling, distributed workloads, and containerized deployments

Senior Software Engineer

The AI & Innovation team at Microsoft Suzhou is seeking a highly motivated Senio...
Location:
China, Beijing
Salary:
Not provided
Microsoft Corporation
Expiration Date
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science, Electrical Engineering, or related technical field AND 4+ years of technical engineering experience with coding in languages such as Python, C++, or C#
  • OR equivalent industry experience
  • 7+ years of software engineering experience with a focus on AI/ML systems
  • Proven experience with one or more of the following: Developing or applying generative AI models
  • Building and optimizing inference pipelines for large AI models on cloud infrastructure
  • Integrating AI features into consumer-facing web or mobile applications at scale
  • Working with programmatic advertising ecosystems
  • Familiarity with cloud services (Azure preferred), microservices architecture, and DevOps practices
  • Hands-on experience in at least two of the three core areas: AI/ML Prototyping: Experience with deep learning frameworks (PyTorch, TensorFlow) and implementing/tuning models from recent literature
  • Video/Graphics Processing: Experience with video codecs (FFmpeg), computer graphics, GPU programming (CUDA), or real-time media pipelines
Job Responsibility:
  • Rapid AI Prototyping: Design, build, and iterate on high-potential prototypes for AI-powered video generation, editing, and content understanding
  • System Integration & Productionization: Bridge the gap between research prototypes and production-ready systems
  • Integrate AI video generation capabilities with large-scale advertising platforms and consumer products
  • Full-Stack Development: Develop end-to-end solutions encompassing backend AI service APIs, model inference optimization, and frontend interfaces
  • Cross-Functional Collaboration: Work closely with Applied Scientists, Machine Learning Engineers, Product Managers, and Ads Platform teams
  • Technical Leadership: Drive architectural decisions for scalable, reliable, and cost-effective AI service deployment
  • Mentor junior engineers and promote engineering best practices
  • Live Site Ownership: Participate in on-call rotations and act as a Designated Responsible Individual (DRI) to ensure the health, performance, and reliability of services
  • Fulltime

Software Engineer II and Senior Software Engineer - Performance

The Artificial Intelligence Performance team at Microsoft develops AI software t...
Location:
United States, Mountain View
Salary:
100600.00 - 199000.00 USD / Year
Microsoft Corporation
Expiration Date
Until further notice
Requirements:
  • Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, or Python OR equivalent experience
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
  • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Job Responsibility:
  • Identify and drive improvements to end-to-end inference performance of OpenAI and other state-of-the-art LLMs
  • Measure and benchmark performance on Nvidia/AMD GPUs and first-party Microsoft silicon
  • Optimize and monitor performance of LLMs and build SW tooling to enable insights into performance opportunities ranging from the model level to the systems and silicon level to improve customer experience and reduce the footprint of the computing fleet
  • Enable fast time to market of LLMs/models and their deployments at scale by building SW tools that afford velocity in porting models on new Nvidia and AMD GPUs
  • Design, implement, and test functions or components for our AI/DNN/LLM frameworks and tools
  • Speed up or reduce the complexity of key components/pipelines to improve performance and/or efficiency of our systems
  • Communicate and collaborate with our partners both internal and external
  • Embody Microsoft's Culture and Values
  • Fulltime