Lead AI Compiler Engineer

India, Hyderabad · Job Posted March 03, 2026

Job Description

We are looking for an AI Compiler Engineer to join a high-impact team working in the growing field of on-device AI inference acceleration, as an individual contributor or as a technical lead in the AI Group (AIG). As an AI Compiler Engineer, you will design and optimize the AI compiler stack and tools that enable efficient execution of state-of-the-art open-source and proprietary AI models, such as LLMs and other transformer models, on AMD NPUs for on-device AI inference use cases. You will work on transforming high-level AI models into efficient, low-level code that runs on the NPU. Your work will directly impact the performance, efficiency, and scalability of our AI solutions.

Job Responsibility

  • Graph transformation
  • Constant folding
  • Operator fusion: Identify and implement optimizations that reduce memory traffic by fusing operators at different levels of the memory hierarchy (e.g., within an attention block)
  • Common subexpression elimination
  • Problem partitioning and dataflow orchestration: Design algorithms to optimally map a given AI operation onto the NPU, which comprises an interconnected array of AI engines
  • Design and implementation of algorithms to orchestrate dataflow through multi-level memory hierarchy
  • Kernel Design and Development: Design and implement highly optimized C++/intrinsics-based kernels for AI-related operators
  • Develop vectorized code that leverages SIMD (Single Instruction, Multiple Data) and VLIW (Very Long Instruction Word) for optimal performance
  • Evaluate trade-offs among performance, program memory, and accuracy
  • Testing and Validation: Develop CPU models of the ML operators in C++/Python to validate accuracy
  • Write unit tests and integration tests to ensure correctness and reliability
  • Performance Profiling and Tuning: Profile and analyze the performance of model layers
  • Identify performance and accuracy bottlenecks and resolve them
  • Documentation and Collaboration: Communicate day-to-day technical work effectively and document design specs
  • Follow good coding practices and use a version control system
  • Collaborate with cross-functional teams spanning AI research, core architecture, and software engineering

Requirements

  • Excellent C/C++ and Python coding skills
  • Good understanding of SIMD and VLIW processor architectures
  • Experience with vectorized programming (SIMD)
  • Thorough understanding of fixed-point and floating-point arithmetic
  • Good understanding of various operators in state-of-the-art AI models
  • Knowledge of low-level hardware details (cache hierarchy, DMA programming)
  • Excellent problem-solving skills, especially in debugging, and a passion for on-device AI
  • Prior experience in AI compiler design is preferred
  • BS/Masters/PhD degree in Computer Science, Electrical Engineering, or a related field

Similar Jobs for Lead AI Compiler Engineer

8 matching positions

Lead AI Researcher

Join the AI Research & Solutions Group. We're looking for an engineer who builds...
Location: United States, Menlo Park
Salary: Not provided
Company: OutSystems
Expiration Date: Until further notice
Requirements
  • 5+ years building AI that shipped
  • Trained real models: Architecture design through deployment—not fine-tuning tutorials, actual model development
  • Deep ML fundamentals: Backprop, attention, loss functions, optimization—at a level where you can debug them
  • Production engineering: Expert Python, PyTorch fluency, distributed training, GPU programming, ML infrastructure at scale
  • Dataset intuition: You've built datasets and can spot data issues from model behavior—you know data quality beats model size
  • Agent architecture: Built systems that plan, act, and adapt—agentic loops, tool-calling, multi-step reasoning pipelines
Job Responsibility
  • Build & Train Models: Own the full pipeline: architecture design → dataset engineering → distributed training → production deployment
  • Implement state-of-the-art techniques from papers—often before official implementations exist—translating math into working PyTorch code
  • Build experimentation infrastructure that enables rapid iteration on architectures, training regimes, and evaluation methods
  • Build Agent Systems: Create orchestration loops that enable AI to reason across multi-file changes, invoke tools (compilers, test runners, debuggers), and recover from failures
  • Implement RL for code generation: execution-based rewards (RLVR), process supervision, reward models, and execution semantics alignment
  • Ship Production AI: Optimize for inference: quantization, distillation, pruning, and serving infrastructure—understanding quality/latency/cost tradeoffs
  • Design evaluation pipelines that catch regressions and measure what actually matters
  • Build guardrails and monitoring for production
  • Define integration architecture: APIs, batching, caching, failure handling
  • Lead Technically: Mentor engineers on ML fundamentals, debugging techniques, and the craft of building reliable systems
Employment type: Full-time

Software Engineer - AI SysML (Technical Leadership)

Meta is seeking an AI Software Engineer to join our Research & Development teams...
Location: United States, Sunnyvale
Salary: 219000.00 - 301000.00 USD / Year
Company: Meta
Expiration Date: Until further notice
Requirements
  • Vast experience communicating and working across functions to drive solutions
  • Experience in driving large cross-functional and industry-wide engineering efforts
  • Proven track record of planning multi-year roadmaps in which shorter-term projects ladder up to the long-term vision
  • Experience leading projects with industry-wide impact
  • Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience
  • Significant experience in mentoring/influencing engineers across organizations
  • Specialized experience in one or more of the following machine learning/deep learning domains: ML systems: AI infrastructure, machine learning accelerators, high performance computing, machine learning compilers, GPU architecture, machine learning frameworks, on-device optimization
  • Experience developing AI algorithms or AI-System infrastructure in C/C++ or Python
Job Responsibility
  • Drive the organization’s goal towards relevant machine learning techniques to build & optimize our intelligent systems that improve Meta’s products and experiences
  • Effectively communicate complex features and systems in detail while advocating for higher product quality and engineering efficiency
  • Assist in goal setting related to project impact
  • Develop custom/novel architectures, define use cases, and develop methodology & benchmarks to evaluate different approaches
  • Apply in-depth knowledge of how the machine learning system interacts with the other systems around it
  • Understand industry-wide and Meta-wide trends in computing technology to help assess & develop new technologies within the ML Systems roadmap
  • Drive the team's goals and technical direction to pursue opportunities that make your larger organization more efficient
  • Partner & collaborate with organizational leaders to help improve the level of performance of the team & organization
What we offer
  • Bonus
  • Equity
  • Benefits

AI Research Engineer

Domyn is a company specializing in the research and development of Responsible A...
Location: Italy, Milan
Salary: 50000.00 - 80000.00 EUR / Year
Company: iGenius
Expiration Date: Until further notice
Requirements
  • PhD in Computer Science, Artificial Intelligence or a related field, or equivalent practical experience
  • At least 5 years of proven experience as an AI research engineer or more than 2 years of experience and a PhD
  • Expertise in modern machine learning frameworks such as PyTorch, TensorFlow, and JAX, with deep knowledge of distributed training (PyTorch Distributed, Ray, DeepSpeed)
  • Strong background in parallel computing and high-performance systems, including CUDA programming and compiler optimizations
  • Hands-on experience with ML model debugging and performance profiling tools (TensorBoard, Weights & Biases, NVIDIA Nsight)
  • Proficiency in Python and C++ or Rust, particularly for high-performance inference and AI accelerators
  • Solid understanding of mathematics behind deep learning (linear algebra, probability, optimization)
  • Experience deploying models in production, optimizing for latency, throughput, and memory efficiency
  • Fluent in English
Job Responsibility
  • Build the next generation of large language models, across the full life cycle: pretraining on trillions of tokens, state-of-the-art post-training, and releases spanning from a few billion to hundreds of billions of parameters
  • Collaborate directly with leading industry labs like NVIDIA
  • Work will ship into mission-critical deployments with some of the largest players in banking, defense, manufacturing, and the public sector
  • Design and optimize the infrastructure that trains and serves these models
  • Contribute to open-source projects shaping the European AI ecosystem
What we offer
  • Learning Friday
  • Training budget for books, online courses or other training materials
  • Smart Working (option to work from home)
  • Salary topped up with other bonuses
  • Opportunity to receive company equity
  • Stock options
Employment type: Full-time

Principal AI Software Engineer

WHAT YOU DO AT AMD CHANGES EVERYTHING At AMD, our mission is to build great prod...
Location: United States, San Jose
Salary: 240000.00 - 360000.00 USD / Year
Company: AMD
Expiration Date: Until further notice
Requirements
  • Knowledge in GPU architectures, basic knowledge of CPU architecture
  • Experience in AI/ML software stack spanning compilers, kernels, runtime, libraries, models, frameworks, and performance optimization layers
  • Understanding of GPU programming such as ROCm, CUDA, OpenCL, etc
  • Experience in hardware/software co-design, building high-performance products across the full product lifecycle
  • Experience with operating systems (OS) and device driver development is a plus
  • Undergrad degree required. Bachelor of Science, Masters, or PhD degree with emphasis in Electrical Engineering, Computer architecture, or Computer Science with relevant experience preferred
Job Responsibility
  • Hardware-Software Co-design: Collaborate across hardware architecture, compiler, math libraries, kernel and framework teams to influence future silicon features based on evolving AI workload trends
  • Strong Execution: Deliver innovations and roadmap for AI software stack across all AMD products, ensuring AMD remains the platform of choice for top-tier AI customers
  • Workload Performance Engineering: Lead the profiling, analysis, and tuning of large-scale models (LLMs, Diffusion, Multimodal, and MoE) to ensure out-of-the-box performance excellence on AMD hardware
  • Ecosystem Innovation: Drive the development of advanced tools and frameworks for performance estimation, modeling, and automated reporting
  • Customer Engagement: Partner with top customers and hyperscalers to understand their unique workload requirements and deliver tailored architectural wins and software optimizations
  • Community & Open Source: Mentor and inspire other engineers and contribute to ROCm open source
What we offer
  • AMD benefits at a glance
Employment type: Full-time

Principal AI Software Engineer

AMD AI Group is seeking a highly influential technical leader for OneROCm — driv...
Location: United States, San Jose
Salary: Not provided
Company: AMD
Expiration Date: Until further notice
Requirements
  • Knowledge in GPU architectures, basic knowledge of CPU architecture
  • Experience in AI/ML software stack spanning compilers, kernels, runtime, libraries, models, frameworks, and performance optimization layers
  • Understanding of GPU programming such as ROCm, CUDA, OpenCL, etc
  • Experience in hardware/software co-design, building high-performance products across the full product lifecycle
  • Experience with operating systems (OS) and device driver development is a plus
  • Undergrad degree required. Bachelor of Science, Masters, or PhD degree with emphasis in Electrical Engineering, Computer architecture, or Computer Science with relevant experience preferred
Job Responsibility
  • Hardware-Software Co-design: Collaborate across hardware architecture, compiler, math libraries, kernel and framework teams to influence future silicon features based on evolving AI workload trends
  • Strong Execution: Deliver innovations and roadmap for AI software stack across all AMD products, ensuring AMD remains the platform of choice for top-tier AI customers
  • Workload Performance Engineering: Lead the profiling, analysis, and tuning of large-scale models (LLMs, Diffusion, Multimodal, and MoE) to ensure out-of-the-box performance excellence on AMD hardware
  • Ecosystem Innovation: Drive the development of advanced tools and frameworks for performance estimation, modeling, and automated reporting
  • Customer Engagement: Partner with top customers and hyperscalers to understand their unique workload requirements and deliver tailored architectural wins and software optimizations
  • Community & Open Source: Mentor and inspire other engineers and contribute to ROCm open source
What we offer
  • Benefits offered are described: AMD benefits at a glance
Employment type: Full-time

AI Performance Engineer

As a member of the Computing Product Line, Heterogeneous Memory Software Lab, yo...
Location: Poland, Warszawa
Salary: 40000.00 - 50000.00 PLN / Month
Company: Devire
Expiration Date: Until further notice
Requirements
  • Deep understanding of GPU or NPU architecture, including execution units, memory hierarchy, interconnects, and thread scheduling, as well as performance bottleneck analysis methodologies
  • Familiarity with mainstream deep learning frameworks such as PyTorch, TensorFlow, or JAX
  • Hands-on experience in deep learning operator/kernel development and performance tuning, with the ability to implement and optimize complex operators
  • Proficiency with performance analysis and profiling tools (e.g., Nsight Compute, nvprof, torch.profiler), and ability to conduct quantitative analysis and performance modeling
  • Strong system design and software engineering skills, with the ability to balance performance, maintainability, and generality in complex systems
  • Master’s or Ph.D. degree in Computer Architecture, Compiler Design, High Performance Computing, or a related field
Job Responsibility
  • Lead performance optimization of AI models on Ascend NPUs, including performance analysis, bottleneck identification, and optimization implementation for both training and inference workloads
  • Analyze performance bottlenecks of multimodal models and large language models (LLMs) on the Ascend platform, covering operators, kernels, memory access patterns, and scheduling
  • Design and implement optimization strategies
  • Develop and optimize critical operators/kernels, continuously improving execution efficiency, memory access patterns, parallelization strategies, and hardware resource utilization
  • Research and apply advanced techniques such as auto-tuning, operator fusion, graph optimization, and scheduling optimization in real-world production scenarios
  • Build and lead an NPU performance optimization team
  • Communicate findings to cross-functional teams and leadership, and contribute to the evolution of next-generation Ascend NPU architecture
What we offer
  • Private healthcare package
  • Sport Cards
  • Benefit Platform
  • Special discounts for employees
  • Office massages
  • Annual bonus
Employment type: Full-time

Technologist Lead, Full-Chip SoC Sign-off EMIR Closure Engineer

The candidate to have a passion for complex processor architecture, digital desi...
Location: India, Bengaluru
Salary: Not provided
Company: Sandisk
Expiration Date: Until further notice
Requirements
  • Minimum 8 to 12 years of relevant work experience
  • Expertise in Synthesis, ICC2/FC (Fusion Compiler) Physical Design flows/methodologies or equivalent tools
  • Expertise in Signoff tools like Ansys Redhawk/RHSC on EMIR, PT-PX for Power signoff and Primetime for Timing
  • Should have worked as the go-to person or technical lead on at least a few full-chip projects
  • Strong technical leadership and ability to mentor/guide/coach design engineers
  • Strong inter-personal skills and ability to collaborate with teams spread across multiple geos
  • Should have good scripting experience in Shell, Python, Perl, TCL, UNIX
  • Must be using AI technologies in day-to-day problem solving
  • Bachelor's or Master's degree in Electronics and Communication Engineering or Electronics and Electrical Engineering
Job Responsibility
  • EMIR closure at Tile/Partition/Subsystem/SoC level on a particular node
  • Ownership of EMIR Flow & Signoff
  • Drive EMIR closure for design blocks and full-chip integration
  • Align EMIR analysis phases with PD milestones (INIT, PREFINAL, FINAL)
  • Ensure compliance with signoff limits provided by SIPI team for given technology node
  • Ability to analyze EMIR violations and propose implementation choices
  • Familiarity with DFx and scan architecture for EMIR considerations
  • Define EMIR signoff strategy, including IR drop and electromigration analysis
  • Review floorplan, clocking, and bus structures for IR hotspots
  • Coordinate bump allocation and power rail alignment with integration teams
Employment type: Full-time

Senior Gameplay Programmer

Location: United Kingdom, Nottingham
Salary: Not provided
Company: 360 Resourcing Solutions
Expiration Date: Until further notice
Requirements
  • Strong knowledge of C++ with an emphasis on maintainable, reusable, and well-documented code. Familiarity with modern C++ features and patterns (e.g., RAII) is a bonus
  • Unreal Engine 5 knowledge, including best practices for blueprints, UObjects, and delegate usage
  • Experience with navigation systems (e.g., recast/detour, navlinks)
  • Proficiency in state trees, state machines, EQS, and character movement/physics
  • Familiarity with software design patterns such as state machines, hashmaps, and entity-component systems
  • Experience with performance analysis and subsequent optimisation
  • A positive attitude towards collaboration and the skills of others
  • Experience in giving and receiving constructive feedback
  • Proficiency with source control systems
  • Perforce experience is ideal
Job Responsibility
  • Design and maintain efficient systems for gameplay AI, animation, character physics, and character spawning in UE5
  • Translate gameplay requirements into technical solutions, using C++ and UE5’s blueprint systems for data-driven development
  • Write well-optimised, thoroughly documented code that meets established coding standards
  • Take full or shared ownership of features and systems, from ideation and planning through to support and bug fixing
  • Serve as a point of contact for your area of expertise
  • Conduct performance analysis and optimise code for multi-platform stability and efficiency
  • Build strong relationships with team members and cross-departmental colleagues to ensure seamless cooperation
  • Follow best practices for source control (Perforce preferred), ensuring build stability and multi-platform compilation success
  • Review code changes with a focus on quality, while providing and receiving constructive feedback in a collaborative and respectful manner
  • Lead technical discussions and align engineers and stakeholders on approach
What we offer
  • Core hours 9.30am – 4pm, remaining hours worked flexibly
  • Relocation support to Nottingham, UK (if required)
  • Holiday allowance that increases with service (to a maximum of 30 days plus statutory public holidays)
  • Annual pay reviews
  • Company pension contribution that increases with service
  • Company enhanced full pay for maternity leave for the first 26 weeks (to qualifying expectant mothers)
  • Clear career progression within Dambuster Studios
  • Studio funded learning and development opportunities
  • Modern game development environment with the latest technologies
  • Vibrant, modern city centre location with good transport links
Employment type: Full-time