
Lead AI Compiler Engineer


AMD

Location:
Hyderabad, India


Contract Type:
Not provided

Salary:
Not provided

Job Description:

We are looking for an AI Compiler Engineer to join this high-impact team working in the growing field of on-device AI inference acceleration, as an individual contributor or as a technical lead in the AI group (AIG). As an AI Compiler Engineer, you will design and optimize the AI compiler stack and tools that enable efficient execution of state-of-the-art open-source and proprietary AI models, such as LLMs and transformer models, on AMD NPUs for on-device AI inference use cases. You will transform high-level AI models into efficient, low-level code that runs on the NPU. Your work will directly impact the performance, efficiency, and scalability of our AI solutions.

Job Responsibility:

  • Graph transformation
  • Constant folding
  • Operator fusion: Identify and implement performance optimization opportunities by reducing memory traffic through operator fusion at different memory hierarchy levels e.g., attention block
  • Common subexpression elimination
  • Problem partitioning and dataflow orchestration: Design algorithms to optimally map a given AI operation onto the NPU, which comprises an interconnected array of AI engines
  • Design and implementation of algorithms to orchestrate dataflow through multi-level memory hierarchy
  • Kernel Design and Development: Design and implement highly optimized C++/intrinsic based kernels for AI related operators
  • Develop vectorized code that leverages SIMD (Single Instruction, Multiple Data) and VLIW (Very Long Instruction Word) for optimal performance
  • Analyze performance, program-memory, and accuracy tradeoffs
  • Testing and Validation: Develop CPU models for the ML operators in C++/ Python to validate accuracy
  • Write unit tests and integration tests to ensure correctness and reliability
  • Performance Profiling and Tuning: Profile and analyze the performance of model layers
  • Identify performance/accuracy bottlenecks and alleviate those
  • Documentation and Collaboration: Communicate day-to-day work effectively and document design specs
  • Follow good coding practices, including use of a version control system
  • Collaborate with cross-functional teams spanning AI research, core architecture, and software engineering
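The graph-level passes listed above (constant folding, operator fusion) can be sketched on a toy expression IR. This is purely illustrative: the `Node` type, the op names, and the mul+add→fma fusion rule are assumptions for the sketch, not AMD's actual compiler internals.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    op: str                        # "const", "input", "add", "mul", or "fma"
    inputs: list = field(default_factory=list)
    value: Optional[float] = None  # set only for "const" nodes

def fold_constants(node: Node) -> Node:
    """Constant folding: replace any op whose inputs are all constants
    with a single const node computed at compile time."""
    if node.op in ("const", "input"):
        return node
    folded = [fold_constants(i) for i in node.inputs]
    if all(i.op == "const" for i in folded):
        if node.op == "add":
            return Node("const", value=sum(i.value for i in folded))
        if node.op == "mul":
            v = 1.0
            for i in folded:
                v *= i.value
            return Node("const", value=v)
    return Node(node.op, folded)

def fuse_mul_add(node: Node) -> Node:
    """Operator fusion: rewrite add(mul(a, b), c) into one fma node,
    eliminating the intermediate mul result (less memory traffic)."""
    inputs = [fuse_mul_add(i) for i in node.inputs]
    if node.op == "add" and len(inputs) == 2 and inputs[0].op == "mul":
        a, b = inputs[0].inputs
        return Node("fma", [a, b, inputs[1]])
    return Node(node.op, inputs, node.value)

# (2 * 3) + 4 is fully constant, so folding collapses the whole graph:
g = Node("add", [Node("mul", [Node("const", value=2.0), Node("const", value=3.0)]),
                 Node("const", value=4.0)])
print(fold_constants(g).value)  # 10.0

# With a runtime input, folding cannot fire, but mul+add fuses into fma:
g2 = Node("add", [Node("mul", [Node("input"), Node("const", value=3.0)]),
                  Node("const", value=4.0)])
fused = fuse_mul_add(g2)
print(fused.op)  # fma
```

A production pass would operate on a DAG with shared subexpressions rather than a tree, which is also where common subexpression elimination comes in.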

Requirements:

  • Excellent C/C++ and Python coding skills
  • Good understanding of SIMD, VLIW processor architecture
  • Experience with vectorized programming (SIMD)
  • Thorough understanding of fixed and floating point arithmetic
  • Good understanding of various operators in state-of-the-art AI models
  • Knowledge of low-level hardware details (cache hierarchy, DMA programming)
  • Excellent problem-solving skills, especially in debugging, and a passion for on-device AI
  • Prior experience in AI compiler design is preferred
  • BS/MS/PhD degree in Computer Science, Electrical Engineering, or a related field
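The fixed-point arithmetic requirement above boils down to reasoning like the following sketch. The Q5.10 format and round-to-nearest choice are arbitrary illustrations, not tied to any AMD NPU datatype.

```python
FRAC_BITS = 10  # Q5.10: 1 sign bit, 5 integer bits, 10 fractional bits

def to_fixed(x: float) -> int:
    """Quantize a float to the nearest Q5.10 integer."""
    return round(x * (1 << FRAC_BITS))

def to_float(q: int) -> float:
    """Recover the real value a Q5.10 integer represents."""
    return q / (1 << FRAC_BITS)

def fixed_mul(a: int, b: int) -> int:
    """Fixed-point multiply: the raw product carries 2*FRAC_BITS fractional
    bits, so shift right (with round-to-nearest) to renormalize."""
    return (a * b + (1 << (FRAC_BITS - 1))) >> FRAC_BITS

a, b = to_fixed(1.5), to_fixed(2.25)
print(to_float(fixed_mul(a, b)))  # 3.375 (exact here; both operands fit Q5.10)
```

The accuracy tradeoff the posting alludes to is visible in `fixed_mul`: each renormalizing shift discards low bits, so error accumulates with operator depth unless intermediate precision is widened.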

Additional Information:

Job Posted:
March 03, 2026


Similar Jobs for Lead AI Compiler Engineer

Research Engineer AI

The role involves conducting high-quality research in AI and HPC, shaping future...
Location: Bristol, United Kingdom
Salary: Not provided
Hewlett Packard Enterprise
Expiration Date: Until further notice
Requirements
  • A good working knowledge of AI/ML frameworks (at least TensorFlow and PyTorch), of data preparation, handling, and lineage control, and of model deployment, particularly in a distributed environment
  • At least a B.Sc. equivalent in a Science, Technology, Engineering or Mathematical discipline
  • Development experience in compiled languages such as C, C++ or Fortran and experience with interpreted environments such as Python
  • Parallel programming experience, with relevant programming models such as OpenMP, MPI, CUDA, OpenACC, HIP, PGAS languages is highly desirable
Job Responsibility
  • Perform world-class research while also shaping products of the future
  • Enable high performance AI software stacks on supercomputers
  • Provide new environments/abstractions to support application developers to build, deploy, and run AI applications taking advantage of leading-edge hardware at scale
  • Manage modern data-intensive AI training and inference workloads
  • Port and optimize workloads of key research centers like the AI safety institute
  • Support onboarding and scaling of domain-specific applications
  • Foster collaboration with the UK and European research community
What we offer
  • Health & Wellbeing benefits that support physical, financial and emotional wellbeing
  • Career development programs catered to achieving career goals
  • Unconditional inclusion in the workplace
  • Flexibility to manage work and personal needs
  • Fulltime

AI Research Engineer, Scaling

As a Research Engineer focused on Scaling, you will design and build robust infr...
Location: Palo Alto, United States
Salary: 180000.00 - 300000.00 USD / Year
1X Technologies
Expiration Date: Until further notice
Requirements
  • Strong programming experience in Python and/or C++
  • Deep intuitive understanding of training and inference speed bottlenecks and scaling laws
  • A mindset aligned with extremely high scaling: belief that scale is foundational to enabling humanoid robotics
  • Degree in Computer Science or a related field
  • Experience with distributed training frameworks (e.g., TorchTitan, DeepSpeed, FSDP/ZeRO), multi-node debugging, and experiment management
  • Proven skills in optimizing inference performance using graph compilers, batching/scheduling, and serving systems like TensorRT or equivalents
  • Familiarity with quantization strategies (PTQ, QAT, INT8/FP8) and tools such as TensorRT and bitsandbytes
  • Experience developing or tuning CUDA or Triton kernels with understanding of hardware-level optimization (vectorization, tensor cores, memory hierarchies)
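Of the quantization strategies this card names, symmetric per-tensor INT8 PTQ is the simplest to sketch. This is a hedged illustration only: the max-abs scale choice and the helper names are assumptions, not 1X's actual pipeline, and real PTQ works on tensors with calibration data rather than a toy list.

```python
def quantize_int8(values):
    """Symmetric per-tensor PTQ: derive one scale from max |x|,
    then round and clamp every value into the int8 range."""
    scale = max(abs(v) for v in values) / 127.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Map int8 codes back to approximate floats."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.02, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
# Round-trip error for non-clipped values is at most scale / 2:
err = max(abs(w - r) for w, r in zip(weights, restored))
print(all(-128 <= x <= 127 for x in q), err <= scale / 2)  # True True
```

QAT differs in that this rounding is simulated during training so the model learns around it; FP8 swaps the integer grid for a low-precision floating-point one.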
Job Responsibility
  • Own and lead scaling of distributed training and inference systems
  • Ensure compute resources are optimized to make data the primary constraint
  • Enable massive training runs (1000+ GPUs) using robot data, with robust fault tolerance, experiment tracking, and distributed operations
  • Optimize inference throughput for datacenter use cases such as world models and diffusion engines
  • Reduce latency and enhance performance for on-device robot policies using techniques such as quantization, scheduling, and distillation
What we offer
  • Equity
  • Health, dental, and vision insurance
  • 401(k) with company match
  • Paid time off and holidays
  • Fulltime

Senior Research Engineer

The HPE HPC & AI EMEA Research Lab (ERL) is characterized by a unique blend of i...
Location: Munich / Berlin, Germany
Salary: Not provided
Hewlett Packard Enterprise
Expiration Date: Until further notice
Requirements
  • Development experience in compiled languages such as C, C++ or Fortran and experience with interpreted environments such as Python
  • At least a B.Sc. equivalent in a Science, Technology, Engineering or Mathematical discipline
  • Parallel programming experience, with programming models such as OpenMP, MPI, CUDA, OpenACC, HIP, PGAS languages, etc.
  • An understanding of AI/ML frameworks, experience with frameworks such as TensorFlow or PyTorch is highly desirable
  • An interest in system- and data center monitoring and operational data analysis
  • Professional language skills in English and German
Job Responsibility
  • Perform world-class research while also shaping products of the future
  • Work with the most esteemed research partners across Europe
  • Enable high performance research software on pre-Exascale and Exascale supercomputers
  • Provide new environments/abstractions to support application developers to build, deploy, and run applications taking advantage of leading-edge hardware at scale
  • Make and operate HPC/AI systems and datacenters in a sustainable way
  • Manage modern data-intensive workloads in high performance environments
What we offer
  • Competitive salary and extensive benefits package (pension scheme, insurances, bike and car leasing, and other fringe benefits)
  • Work-life balance (flexible working time and hybrid workplace model, 30 vacation days, four HPE Wellness-Fridays, up to six months paid parental leave)
  • Support for education, training, and career development
  • Diverse and dynamic work environment

Research Engineer, Scaling

As a Research Engineer, Scaling, you will design and build infrastructure to sup...
Location: Palo Alto, United States
Salary: 180000.00 - 300000.00 USD / Year
1X Technologies
Expiration Date: Until further notice
Requirements
  • Strong programming experience in Python and/or C++
  • Deep intuitive understanding of what affects training or inference speed: from bottlenecks to scaling laws
  • A mindset aligned with extremely high scaling: belief that scale is foundational to enabling humanoid robotics
  • Degree in Computer Science or a related field
  • Hands‑on experience with distributed training frameworks (e.g., TorchTitan, DeepSpeed, FSDP/ZeRO), multi‑node debugging, experiment management
  • Proven skills optimizing inference performance: graph compilers, batching/scheduling, serving systems (e.g., using TensorRT or equivalents)
  • Familiarity with quantization strategies (PTQ, QAT, INT8/FP8) and tools like TensorRT, bitsandbytes, etc.
  • Experience writing or tuning CUDA or Triton kernels, with an understanding of hardware features like vectorization, tensor cores, and memory hierarchies
Job Responsibility
  • Own and lead scaling of both distributed training and inference systems
  • Ensure compute resources are sufficient so that data, not hardware, is the limiter
  • Enable massive training at scale (1000+ GPUs) on robot data, handling fault tolerance, experiment tracking, distributed operations, and large datasets
  • Optimize inference throughput in datacenter contexts (e.g., for world models and diffusion engines)
  • Reduce latency and optimize performance for on‑device robot policies through techniques like quantization, scheduling, distillation, etc.
What we offer
  • Health, dental, and vision insurance
  • 401(k) with company match
  • Paid time off and holidays
  • Fulltime

Software Engineer, Systems ML - Frameworks / Compilers / Kernels

In this role, you will be a member of the MTIA (Meta Training & Inference Accele...
Location: Menlo Park, United States
Salary: 181000.00 USD / Year
Meta
Expiration Date: Until further notice
Requirements
  • Proven C/C++ programming skills
  • Currently has, or is in the process of obtaining a Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience. Degree must be completed prior to joining Meta.
  • Experience in AI framework development or accelerating deep learning models on hardware architectures.
Job Responsibility
  • Development of SW stack with one of the following core focus areas: AI frameworks, compiler stack, high performance kernel development and acceleration onto next generation of hardware architectures.
  • Contribute to the development of the industry-leading PyTorch AI framework core compilers to support new state of the art inference and training AI hardware accelerators and optimize their performance.
  • Analyze deep learning networks, develop & implement compiler optimization algorithms.
  • Collaborate with AI research scientists to accelerate the next generation of deep learning models such as recommendation systems, generative AI, computer vision, NLP, etc.
  • Performance tuning and optimizations of deep learning framework & software components.
What we offer
  • Bonus
  • Equity
  • Benefits

Software Engineer, Systems ML - Frameworks / Compilers / Kernels (PhD)

In this role, you will be a member of the MTIA (Meta Training & Inference Accele...
Location: Bellevue, United States
Salary: 181000.00 USD / Year
Meta
Expiration Date: Until further notice
Requirements
  • Proven C/C++ programming skills
  • Currently has, or is in the process of obtaining a Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience. Degree must be completed prior to joining Meta
  • Experience in AI framework development or accelerating deep learning models on hardware architectures
Job Responsibility
  • Development of SW stack with one of the following core focus areas: AI frameworks, compiler stack, high performance kernel development and acceleration onto next generation of hardware architectures
  • Contribute to the development of the industry-leading PyTorch AI framework core compilers to support new state of the art inference and training AI hardware accelerators and optimize their performance
  • Analyze deep learning networks, develop & implement compiler optimization algorithms
  • Collaborate with AI research scientists to accelerate the next generation of deep learning models such as recommendation systems, generative AI, computer vision, NLP, etc.
  • Performance tuning and optimizations of deep learning framework & software components
What we offer
  • Bonus
  • Equity
  • Benefits

Principal Quantum Systems Software Development Engineer

As a Principal Quantum Systems Software Development Engineer in our Quantum Syst...
Location: Redmond, United States
Salary: 139900.00 - 274800.00 USD / Year
Microsoft Corporation
Expiration Date: Until further notice
Requirements
  • Doctorate in Computer Science, Software Engineering, or related field AND 3+ years software industry experience, including developing commercial software, compilers, scientific computing applications, or multi-component systems; OR
  • Master's Degree in Computer Science, Software Engineering, or related field AND 4+ years software industry experience, including developing commercial software, compilers, scientific computing applications; OR
  • Bachelor's Degree in Computer Science, Software Engineering, or related field AND 6+ years software industry experience, including developing commercial software, compilers, scientific computing applications; OR
  • Equivalent experience
  • 6+ years programming experience in related programming languages
  • 6+ years experience in a collaborative environment
  • Ability to meet Microsoft, customer and/or government security screening requirements
  • Ability to work in an "AI first" environment using modern AI tools to accelerate discovery through hardware development
  • Familiarity with designing and building AI agents/copilots that assist with experiment setup, log triage, measurement report generation, protocol templating, and knowledge retrieval
Job Responsibility
  • Integrate the topological qubit platform with Microsoft’s quantum software stack
  • Define and evolve interfaces between device control/readout, error‑syndrome pipelines, QIR/QDK toolchains, and Azure services
  • Drive the software architecture and technical roadmap for scale‑up
  • Lead multi‑year design for control, decoding, and orchestration systems that support progressively larger topological QPUs and higher logical‑qubit counts
  • Design, implement, integrate, and test major system components
  • Ship production‑quality services, runtimes, and APIs spanning device orchestration, calibration & tuning automation, data pipelines, observability, and reliability
  • Use AI every day to go faster and improve quality
  • Apply Copilot/LLM workflows for design reviews, code generation, test authoring, telemetry triage, and experiment planning; establish team guardrails for responsible AI use in engineering
  • Lead and mentor
  • Fulltime

Lead AI Researcher

Join the AI Research & Solutions Group. We're looking for an engineer who builds...
Location: Menlo Park, United States
Salary: Not provided
OutSystems
Expiration Date: Until further notice
Requirements
  • 5+ years building AI that shipped
  • Trained real models: Architecture design through deployment—not fine-tuning tutorials, actual model development
  • Deep ML fundamentals: Backprop, attention, loss functions, optimization—at a level where you can debug them
  • Production engineering: Expert Python, PyTorch fluency, distributed training, GPU programming, ML infrastructure at scale
  • Dataset intuition: You've built datasets and can spot data issues from model behavior—you know data quality beats model size
  • Agent architecture: Built systems that plan, act, and adapt—agentic loops, tool-calling, multi-step reasoning pipelines
Job Responsibility
  • Build & Train Models: Own the full pipeline: architecture design → dataset engineering → distributed training → production deployment
  • Implement state-of-the-art techniques from papers—often before official implementations exist—translating math into working PyTorch code
  • Build experimentation infrastructure that enables rapid iteration on architectures, training regimes, and evaluation methods
  • Build Agent Systems: Create orchestration loops that enable AI to reason across multi-file changes, invoke tools (compilers, test runners, debuggers), and recover from failures
  • Implement RL for code generation: execution-based rewards (RLVR), process supervision, reward models, and execution semantics alignment
  • Ship Production AI: Optimize for inference: quantization, distillation, pruning, and serving infrastructure—understanding quality/latency/cost tradeoffs
  • Design evaluation pipelines that catch regressions and measure what actually matters; build guardrails and monitoring for production
  • Define integration architecture: APIs, batching, caching, failure handling
  • Lead Technically: Mentor engineers on ML fundamentals, debugging techniques, and the craft of building reliable systems
  • Fulltime