Senior AI Models GPU Deployment Software Engineer Job at AMD (Bangalore)

Software Engineer II and Senior Software Engineer - Performance

The Artificial Intelligence Performance team at Microsoft develops AI software t...

Location

United States, Mountain View

Salary:

100600.00 - 199000.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 2+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, or Python OR equivalent experience
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter

Job Responsibility

Identify and drive improvements to end-to-end inference performance of OpenAI and other state-of-the-art LLMs
Measure and benchmark performance on Nvidia/AMD GPUs and first-party Microsoft silicon
Optimize and monitor performance of LLMs and build SW tooling to enable insights into performance opportunities ranging from the model level to the systems and silicon level to improve customer experience and reduce the footprint of the computing fleet
Enable fast time to market of LLMs/models and their deployments at scale by building SW tools that afford velocity in porting models to new Nvidia and AMD GPUs
Design, implement, and test functions or components for our AI/DNN/LLM frameworks and tools
Speed up and reduce the complexity of key components/pipelines to improve the performance and/or efficiency of our systems
Communicate and collaborate with our partners both internal and external
Embody Microsoft's Culture and Values

Fulltime

Senior Software Engineer - AI

Are you looking for an opportunity to work with the latest Azure offerings and p...

Location

India, Bangalore

Salary:

Not provided

Microsoft Corporation

Expiration Date

Until further notice

Requirements

7+ years of experience in Software Development
Strong programming expertise in one or more languages such as Python, Go, Java, or C#, with experience designing production-grade services and APIs
Experience building AI-powered applications, including integrating LLMs, implementing agent or Copilot workflows, and orchestrating multi-step AI interactions
Hands-on experience with LLM application frameworks and orchestration tools such as Semantic Kernel, LangChain, or similar agent frameworks
Familiarity with retrieval-augmented generation (RAG) architectures, vector databases, embeddings, and semantic search systems
Experience evaluating and improving model performance through prompt design, evaluation frameworks, fine-tuning, or feedback loops
Solid understanding of distributed systems concepts including scalability, reliability, observability, caching, and asynchronous processing
Experience deploying and operating AI workloads in cloud environments (preferably Azure), including containerized services and GPU-enabled infrastructure
Understanding of Responsible AI practices, including model governance, safety, privacy, and evaluation of AI behaviour in production systems
Ability to work across product, research, and engineering teams to translate product scenarios into scalable AI system architectures

Job Responsibility

Design, build, and operate scalable AI systems that power intelligent product experiences, including Copilot and agent-driven workflows
Architect and implement backend services that support multi-step AI interactions, including orchestration pipelines, context management, memory/state persistence, and tool execution
Integrate large language models (LLMs), APIs, and internal services to enable context-aware, human-in-the-loop experiences across customer scenarios
Build and maintain data and inference pipelines that support model training, fine-tuning, evaluation, and real-time inference across diverse data sources
Evaluate, benchmark, and tune AI/ML models (LLMs and traditional models) to meet product requirements for accuracy, latency, reliability, and safety
Implement robust retrieval, grounding, and knowledge integration mechanisms (e.g., RAG systems, semantic indexing, vector search) to power intelligent applications
Collaborate with product managers, software engineers, and researchers to translate product vision into production-ready AI capabilities and measurable outcomes
Ensure reliability, observability, and governance of AI systems, including monitoring model performance, data quality, and responsible AI practices
Build reusable platforms, APIs, and tools that enable teams to rapidly develop AI-powered features and self-service intelligent applications

Fulltime

Senior AI Software Development Engineer

We are currently seeking a senior, experienced AI Software Engineer to join our ...

Location

Romania, Iasi; Brasov; Bucharest

Salary:

Not provided

AMD

Expiration Date

Until further notice

Requirements

Demonstrated experience delivering complex AI or software systems and influencing technical direction within a team
Strong understanding of AI/ML concepts and techniques, including deep learning, supervised and unsupervised learning, reinforcement learning, and probabilistic graphical models
Familiarity with popular ML frameworks and libraries, such as TensorFlow, PyTorch, Keras, and Scikit-learn
Proficient in programming languages such as Python, C++, and Java, with a strong focus on maintainable, high-quality production code
Familiarity with AMD's hardware (GPU, CPU, and APU) and software (ROCm, OpenCL, HIP) platforms is a plus, but not required
Strong analytical, problem-solving, and critical-thinking skills, with the ability to balance hands-on development with broader technical ownership
Excellent written and verbal communication skills, with the ability to effectively communicate complex concepts to a diverse audience
Bachelor’s or Master’s degree in Computer Science, Computer/Software Engineering or related technical discipline

Job Responsibility

Serve as a senior technical contributor, helping define system architecture, development standards, and best practices
Provide mentorship and technical guidance to other engineers through design discussions, code reviews, and knowledge sharing
Assist in the development of artificial intelligence models, algorithms, and systems tailored to specific project goals and requirements
Collaborate effectively with cross-functional teams, including product managers, researchers, hardware engineers, and software developers to support the development of comprehensive AI solutions
Learn and adapt to new techniques and methodologies to enhance product performance and develop new features
Optimize machine learning models for efficient deployment on AMD hardware and software platforms
Contribute to the process of monitoring the performance of deployed models, maintenance and updates, and troubleshooting any related issues
Stay current on the latest advancements in the fields of AI and machine learning, collaborating closely with colleagues to foster a culture of innovation

Fulltime

Senior AI Models MAD - Model Automation and Dashboarding Engineer

AMD is looking for a skilled and motivated software engineer to join the Model A...

Location

India, Hyderabad

Salary:

Not provided

AMD

Expiration Date

Until further notice

Requirements

Undergraduate and/or Master’s Degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field
Strong C/C++/Python programming and software design skills, including debugging, performance analysis, and test design
Experience in test automation, CI/CD, and Linux scripting
Knowledge of GPU computing (HIP, CUDA, OpenCL)
Knowledge of Docker, Kubernetes, or Ansible for testing and deploying AI models and services at scale
Proficiency with version control (GitHub), testing strategies, code reviews, and collaborative software development
Strong written and verbal communication skills with a proactive approach to defining and driving development efforts

Job Responsibility

Enable and optimize key AI models (LLM, Vision, MultiModal, etc.) on AMD GPUs
Optimize AI frameworks like PyTorch, TensorFlow, etc., on AMD GPUs in upstream open-source repositories
Collaborate with internal GPU library teams and open-source framework maintainers to analyze, optimize, and integrate code changes upstream
Build and maintain automated functional and performance testing pipelines for AI models across ROCm-supported hardware using scalable tools
Develop tools and automation for continuous benchmarking and regression tracking across hardware generations and ROCm releases
Build and maintain real-time dashboards that report relevant performance, accuracy, and reliability metrics
Support public-facing MAD GitHub repositories and Docker releases, enabling the community to run and validate models on ROCm
Contribute to the design of portable, easy-to-use Python interfaces that support multi-node profiling, distributed workloads, and containerized deployments

Senior Software Engineer, CoreAI Workload Engines

The CoreAI Workloads team builds the foundational inference engines and APIs tha...

Location

United States, Redmond

Salary:

119800.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field and 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, Python, or equivalent experience.
Proven ability to design and operate large-scale, production inference services with high reliability and performance requirements, and to ship performance improvements safely via disciplined experimentation.
Strong skills in performance analysis: benchmarking, profiling, diagnosing regressions, and turning results into concrete engine/runtime changes.
Strong problem-solving skills and the ability to debug complex, cross-layer systems issues.
Demonstrated technical leadership, including mentoring engineers, driving cross-team architectural alignment, and leveraging AI tools and AI-assisted workflows to accelerate engineering velocity and quality.
Hands-on experience with Kubernetes (building and operating services on k8s), including debugging production issues and designing platform abstractions (e.g., custom resources/controllers) and scheduling-aware deployments (e.g., node affinity, taints/tolerations, resource requests/limits).
Strong collaboration and communication skills, with the ability to work across organizational boundaries.

Job Responsibility

Optimize inference engines for OpenAI and open-source models by implementing and shipping performance/efficiency improvements across runtime, scheduling, and serving paths (latency, throughput, utilization, availability, and cost).
Run experiments end-to-end: formulate hypotheses, implement engine changes (including Python/PyTorch integration points where relevant), analyze results, and ship improvements behind guardrails.
Build and use experimentation capabilities for large-scale AI inference (experiment lifecycle, tracking, metric modeling, comparability standards, automated analysis) so the team can iterate quickly and safely.
Own serving availability and efficiency for Azure OpenAI Service workloads through tiered experimentation, lean segmentation, and multi-modal utilization across heterogeneous fleets—turning findings into shipped engine improvements.
Design and evolve inference serving architectures to improve utilization and latency using techniques such as disaggregated serving, multi-token prediction, KV offload/retrieval, and quantization—validated via staged rollouts and production guardrails.
Extend AI infrastructure abstractions to support elastic, heterogeneous inference engines reliably at scale (e.g., dynamic scaling across model families, modalities, and workload classes while maintaining isolation and SLOs).
Tune and scale inference engines across NVIDIA GPU generations (A100, H100, H200) for state-of-the-art OpenAI models, focusing on serving efficiency, utilization, and reliability (not hardware bring-up).
Partner with networking and storage teams to leverage high-performance interconnects (e.g., RDMA/InfiniBand-class fabrics such as RoCE over IB) for distributed inference, without owning low-level kernel/driver enablement.
Drive end-to-end features from design through production: observability, diagnostics, performance regression detection, and operational excellence for inference serving.
Influence platform architecture and technical direction across teams through design reviews, clear metrics, and technical leadership focused on experimentation velocity and production reliability.

What we offer

Benefits and other compensation

Fulltime

Senior Software Engineer

The AI & Innovation team at Microsoft Suzhou is seeking a highly motivated Senio...

Location

China, Beijing

Salary:

Not provided

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science, Electrical Engineering, or related technical field AND 4+ years of technical engineering experience with coding in languages such as Python, C++, or C# OR equivalent industry experience
7+ years of software engineering experience with a focus on AI/ML systems
Proven experience with one or more of the following: Developing or applying generative AI models
Building and optimizing inference pipelines for large AI models on cloud infrastructure
Integrating AI features into consumer-facing web or mobile applications at scale
Working with programmatic advertising ecosystems
Familiarity with cloud services (Azure preferred), microservices architecture, and DevOps practices
Hands-on experience in at least two of the three core areas: AI/ML Prototyping: Experience with deep learning frameworks (PyTorch, TensorFlow) and implementing/tuning models from recent literature
Video/Graphics Processing: Experience with video codecs (FFmpeg), computer graphics, GPU programming (CUDA), or real-time media pipelines

Job Responsibility

Rapid AI Prototyping: Design, build, and iterate on high-potential prototypes for AI-powered video generation, editing, and content understanding
System Integration & Productionization: Bridge the gap between research prototypes and production-ready systems
Integrate AI video generation capabilities with large-scale advertising platforms and consumer products
Full-Stack Development: Develop end-to-end solutions encompassing backend AI service APIs, model inference optimization, and frontend interfaces
Cross-Functional Collaboration: Work closely with Applied Scientists, Machine Learning Engineers, Product Managers, and Ads Platform teams
Technical Leadership: Drive architectural decisions for scalable, reliable, and cost-effective AI service deployment
Mentor junior engineers and promote engineering best practices
Live Site Ownership: Participate in on-call rotations and act as a Designated Responsible Individual (DRI) to ensure the health, performance, and reliability of services

Fulltime

Senior Software Engineer - Performance Tooling

The Artificial Intelligence (AI) Frameworks team at Microsoft develops AI softwa...

Location

United States, Redmond

Salary:

119800.00 - 234700.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C++, or Python OR equivalent experience
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. This includes passing the Microsoft Cloud background check upon hire/transfer and every two years thereafter
Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C++, or Python OR Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C++, or Python OR equivalent experience
4+ years’ practical experience working on high performance applications and performance debugging and optimization on CPUs/GPUs
Experience in DNN/LLM inference and experience in one or more DL frameworks such as PyTorch, TensorFlow, or ONNX Runtime and familiarity with CUDA, ROCm, Triton
Technical background and solid foundation in software engineering principles, computer architecture, GPU architecture, hardware neural net acceleration
Experience in end-to-end performance analysis and optimization of state-of-the-art LLMs and HPC applications, including proficiency using GPU profiling tools
Cross-team collaboration skills and the desire to collaborate in a team of researchers and developers
Ability to independently lead projects

Job Responsibility

Work across multiple layers of the AI software stack (abstractions, programming models, compilers, runtimes, libraries, and APIs) to enable large-scale model training and inference
Benchmark OpenAI and other LLMs for performance on GPUs and Microsoft hardware
Debug, profile, and optimize performance for training/inference workloads on Central Processing Units (CPUs)/Graphics Processing Units (GPUs)
Monitor performance regressions and drive continuous improvements to reduce time-to-deploy and hardware footprint
Collaborate across teams of researchers and engineers to deliver scalable, production-ready AI performance improvements

Fulltime

Senior Software Engineer - Performance

The Artificial Intelligence Performance team at Microsoft develops AI software t...

Location

United States, Mountain View

Salary:

119800.00 - 234700.00 USD / Year

Microsoft Corporation

Expiration Date

Until further notice

Requirements

Bachelor's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python OR equivalent experience.
Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings: Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter.

Job Responsibility

Identify and drive improvements to end-to-end inference performance of OpenAI and other state-of-the-art LLMs
Measure and benchmark performance on Nvidia/AMD GPUs and first-party Microsoft silicon
Optimize and monitor performance of LLMs and build SW tooling to enable insights into performance opportunities ranging from the model level to the systems and silicon level to improve customer experience and reduce the footprint of the computing fleet
Enable fast time to market of LLMs/models and their deployments at scale by building SW tools that afford velocity in porting models to new Nvidia and AMD GPUs
Design, implement, and test functions or components for our AI/DNN/LLM frameworks and tools
Speed up and reduce the complexity of key components/pipelines to improve the performance and/or efficiency of our systems
Communicate and collaborate with our partners both internal and external
Embody Microsoft's Culture and Values

Fulltime

Senior AI Models GPU Deployment Software Engineer
