Senior Researcher - Efficient AI

India, Bangalore · Job Posted March 01, 2026

Job Description

Generative AI is transforming how people create, collaborate, and communicate, redefining productivity across Microsoft 365 for customers worldwide. At Microsoft, we operate one of the largest collaboration and productivity platforms in the world, serving hundreds of millions of consumer and enterprise users. Delivering these AI experiences at scale requires solving some of the hardest efficiency challenges in modern AI systems.

We are an applied research team focused on advancing efficiency across the AI stack, spanning models, ML frameworks, cloud infrastructure, and hardware. We drive mid- and long-term product innovation through close collaboration with research and product teams across the company. We communicate our research both internally and externally through internal technical reports, academic conference publications, open-source releases, and patents. Beyond producing research, we take responsibility for driving ideas through prototyping, validation, and production, with a strong bias toward real-world impact.

The ideal Senior Researcher candidate will work across the full stack, from large-scale serving systems to hardware- and kernel-level optimizations, exploring algorithmic, systems, and hardware/software co-design techniques. Areas of focus include batching, routing, scheduling, caching, endpoint configuration, and GPU architecture–aware optimizations. This role emphasizes end-to-end ownership, with responsibility for identifying high-impact problems and driving research ideas through prototyping, validation, and deployment to deliver measurable customer impact.

Job Responsibility

  • Formulate, develop, and evaluate new algorithmic and system-level approaches for end-to-end AI serving, using analytical modeling and large-scale measurement to study token-level latency, tail latency (p95/p99), throughput-per-dollar, cold-start behavior, warm pool strategies, and capacity planning under multi-tenant SLOs and variable sequence lengths
  • Design and experimentally evaluate endpoint configuration and execution policies, including batching, routing, and scheduling strategies, tensor and pipeline parallelism, quantization and precision profiles, speculative decoding, and chunked or streaming generation, and drive the most promising approaches through robust rollout and validation into production
  • Perform hardware- and kernel-aware optimization by collaborating closely with model, kernel, compiler, and hardware teams to align serving algorithms with attention/KV innovations and accelerator capabilities
  • Build and benchmark experimental prototypes and large-scale measurements to validate research ideas and drive them toward production readiness
  • Produce clear technical documentation, design reviews, and operational playbooks
  • Publish research results, file patents, and, where appropriate, contribute to open-source systems and serving frameworks

Requirements

  • Doctorate in relevant field
  • OR Master's Degree in relevant field AND 3+ years related research experience
  • OR Bachelor's Degree in relevant field AND 4+ years related research experience
  • OR equivalent experience
  • Demonstrated expertise in areas of algorithmic optimization, parallel computing, queuing and scheduling theory, and practical request orchestration under strict SLO constraints
  • Strong understanding of GPU architecture and memory hierarchies
  • Proficiency in C++ and Python for high-performance systems, with strong code quality and profiling/debugging skills
  • Proven record of research impact through publications and/or patents, and experience carrying ideas through to systems that operate at scale in real production environments
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
  • This position will be required to pass the Microsoft Cloud background check upon hire/transfer and every two years thereafter

Nice to have

  • Deep understanding of transformer inference efficiency techniques such as sharding strategies, attention optimizations, paged KV caches, speculative decoding, LoRA, sequence packing or continuous batching, and quantization
  • 3+ years of experience with machine learning frameworks (e.g., PyTorch, TensorFlow) and inference serving frameworks (e.g., vLLM, Triton Inference Server, TensorRT-LLM, ONNX Runtime, Ray Serve, DeepSpeed-MII)
  • 3+ years of experience in GPU programming and optimization, with expert knowledge of CUDA, ROCm, Triton, PTX, CUTLASS, or similar GPU programming frameworks
  • Background in cost and performance modeling, autoscaling, and multi-region deployment or disaster recovery

Similar Jobs for Senior Researcher - Efficient AI

8 matching positions

Senior Researcher - Efficient AI

Generative AI is transforming how people create, collaborate, and communicate—re...
Location: United States, Redmond
Salary: 119800.00 - 234700.00 USD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Doctorate in relevant field OR Master's Degree in relevant field AND 3+ years related research experience OR Bachelor's Degree in relevant field AND 4+ years related research experience OR equivalent experience
  • Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role
  • Demonstrated experience in designing and optimizing efficient inference systems, combining foundations in algorithmic optimization, parallel computing, and request orchestration under strict SLO constraints with deep knowledge of attention and KV‑cache optimizations, batching and scheduling strategies, and cost‑aware deployment
  • 3+ years of experience with machine learning frameworks (e.g., PyTorch, TensorFlow) and inference serving frameworks (e.g., vLLM, Triton Inference Server, TensorRT-LLM, ONNX Runtime, Ray Serve, DeepSpeed-MII)
  • 3+ years of experience in GPU programming and optimization, with expert knowledge of CUDA, ROCm, Triton, PTX, CUTLASS, or similar GPU programming frameworks
  • Proficiency in C++ and Python for high-performance systems, with code quality and profiling/debugging skills
  • Research impact through publications and/or patents, coupled with hands‑on experience taking research ideas through execution and delivery in production
Job Responsibility
  • Formulate, develop, and evaluate new algorithmic and system-level approaches for end-to-end AI serving, using analytical modeling and large-scale measurement to study token-level latency, tail latency (p95/p99), throughput-per-dollar, cold-start behavior, warm pool strategies, and capacity planning under multi-tenant SLOs and variable sequence lengths
  • Design and experimentally evaluate endpoint configuration and execution policies, including batching, routing, and scheduling strategies, tensor and pipeline parallelism, quantization and precision profiles, speculative decoding, and chunked or streaming generation, and drive the most promising approaches through robust rollout and validation into production
  • Perform hardware- and kernel-aware optimization by collaborating closely with model, kernel, compiler, and hardware teams to align serving algorithms with attention/KV innovations and accelerator capabilities
  • Build and benchmark experimental prototypes and large-scale measurements to validate research ideas and drive them toward production readiness
  • Produce clear technical documentation, design reviews, and operational playbooks
  • Publish research results, file patents, and, where appropriate, contribute to open-source systems and serving frameworks
Employment Type: Full-time

Senior AI Researcher

Our data science team is seeking a highly skilled AI researcher with expertise i...
Location: Israel, Tel-Aviv
Salary: Not provided
Company: K Health (khealth.com)
Expiration Date: Until further notice
Requirements
  • 5+ years of experience in AI research, with a strong background in LLM research
  • Advanced proficiency in Python and hands-on experience with leading frameworks and libraries, such as PyTorch, Hugging Face, torchtune, and vLLM
  • M.Sc. or Ph.D. in Computer Science, Data Science, Statistics, Engineering, Mathematics, Physics, or a related field. Preferably with applied focus on machine learning, computer vision, NLP, or deep learning
  • Fast learner with excellent problem-solving skills
  • Positive attitude, intellectual curiosity, and eagerness to learn and share knowledge
  • Passion for medicine, health, and wellbeing, with a drive to make a meaningful impact
  • Proven experience translating deep learning research into reliable, high-impact production systems
  • Collaborative team member who contributes to shared technical decisions and knowledge sharing
Job Responsibility
  • Develop clinical AI models to support K’s clinic workflows, with a focus on patient-facing applications
  • Design and implement scalable, efficient pipelines for data preprocessing, information extraction, and model training
  • Stay up to date with the latest AI research and best practices, and translate them into production-ready solutions
  • Communicate complex technical concepts and solutions to non-technical stakeholders
  • Collaborate with MLE and AI engineering teams to deploy scalable, robust solutions
What we offer
  • Competitive compensation packages based on industry benchmarks for function, level, and geographic location

Senior Researcher - Cloud and AI Infrastructure

Microsoft Research Asia – Vancouver lab, located in the vibrant city of Vancouve...
Location: Canada, Vancouver
Salary: 114400.00 - 203900.00 CAD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Doctorate in relevant field OR equivalent experience
  • Experience publishing academic papers as a lead author or essential contributor
  • Experience participating in a top conference in relevant research domain
  • Experience in optimizing or designing hardware components and architectures to enhance performance, reliability, and efficiency
Job Responsibility
  • Investigate and analyze emerging hardware technologies, trends, and advancements
  • Design and optimize hardware components, systems, and architectures to enhance performance, reliability, and efficiency
  • Conduct simulations, tests, and validations to ensure hardware designs meet required specifications and performance goals
  • Develop prototypes and proof-of-concept models to demonstrate new hardware technologies and applications
  • Identify opportunities for hardware improvements and cost reductions by staying informed about industry best practices and standards
  • Collaborate with cross-functional teams, including software researchers, designers, and engineers, to identify hardware requirements and develop innovative solutions
  • Partner with manufacturing vendors and production teams to transition innovative designs and concepts into deployable systems
  • Document research findings, design decisions, and technical specifications to facilitate knowledge sharing and collaboration within the organization
Employment Type: Full-time

Senior Researcher - Machine Learning: AI for Science

Microsoft Research AI for Science is seeking a talented machine learning researc...
Location: United Kingdom, Cambridge
Salary: Not provided
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • PhD in computer science, machine learning, computational materials science or related area, or comparable industry experience
  • Track record of publications at top-tier conferences or journals (e.g., NeurIPS, ICML, ICLR, Nature/Science or relevant sub-journals)
  • Strong coding ability and proficiency in collaborative code development
  • Ability to quickly iterate between ideation, implementation and evaluation of new research ideas
  • Ability to work in an interdisciplinary collaborative environment, through effective communication of technical concepts to non-experts from different technical backgrounds
Job Responsibility
  • Contribute to and drive an ambitious, high-impact, research agenda on machine learning for materials
  • Develop efficient and expressive machine learning models that address fundamental materials science problems
  • Work with domain experts to develop realistic machine learning metrics and benchmarks
  • Prepare technical papers and presentations
Employment Type: Full-time

Senior Principal Researcher - Cloud and AI Infrastructure

Microsoft Research Asia – Vancouver lab, located in the vibrant city of Vancouve...
Location: Canada, Vancouver
Salary: 163000.00 - 296400.00 USD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Doctorate in relevant field AND 6+ years related research experience
  • OR Master's Degree in relevant field AND 7+ years related research experience
  • OR Bachelor's Degree in relevant field AND 9+ years related research experience
  • OR equivalent experience
  • 3+ years’ experience in research related to infrastructure design, computer architecture, or artificial intelligence
  • Experience publishing academic papers as a lead author or essential contributor
  • Experience participating in a top conference in relevant research domain
  • Experience in optimizing or designing hardware components and architectures to enhance performance, reliability, and efficiency
Job Responsibility
  • Investigate and analyze emerging hardware technologies, trends, and advancements
  • Design and optimize hardware components, systems, and architectures to enhance performance, reliability, and efficiency
  • Conduct simulations, tests, and validations to ensure hardware designs meet required specifications and performance goals
  • Develop prototypes and proof-of-concept models to demonstrate new hardware technologies and applications
  • Identify opportunities for hardware improvements and cost reductions by staying informed about industry best practices and standards
  • Collaborate with cross-functional teams, including software researchers, designers, and engineers, to identify hardware requirements and develop innovative solutions
  • Partner with manufacturing vendors and production teams to transition innovative designs and concepts into deployable systems
  • Document research findings, design decisions, and technical specifications to facilitate knowledge sharing and collaboration within the organization
Employment Type: Full-time

Senior AI Research Engineer

Adyen is building a top-tier AI engineering organization in Amsterdam, San Franc...
Location: Netherlands, Amsterdam
Salary: Not provided
Company: Adyen (adyen.com)
Expiration Date: Until further notice
Requirements
  • You are deeply embedded in the scientific AI research community and have a strong understanding of the latest SOTA advancements
  • You have significant experience and a strong understanding of Generative AI (GenAI) and Large Language Models (LLMs)
  • You demonstrate a strong engineering mindset with a track record of writing clean, efficient, and scalable code suitable for production environments
  • You have demonstrated experience taking cutting-edge AI research papers and implementing them into production-quality code
  • You demonstrate the ability to think critically and deliver simple and elegant solutions to complex, cross-team problems, influencing strategic direction and fostering innovation across the organization
  • You excel at translating complex technical concepts into clear, understandable terms for diverse audiences, including engineers, executives, and during public events
  • You thrive in leveraging empathy, influence, negotiation, relationship building, and conflict resolution to foster strong, trust-based collaborations
Job Responsibility
  • Innovate and Deploy: Drive the execution of Adyen's AI strategy, focusing on the practical application of Generative AI (GenAI) and other AI methodologies in finance
  • Build Production-grade Applications: Bridge the gap between cutting-edge AI research and production by implementing research papers into robust, scalable, and production-ready code
  • Optimize and Scale: Contribute to defining the long-term vision for AI at Adyen, specifically how AI will interact with humans and finance
  • Think Outside the Box: Drive innovation by challenging the status quo, introducing transformative ideas and implementing creative solutions
  • Force Multiplier: Provide mentorship and horizontal sponsorship across the organization
  • Team Player: Actively pair with other engineering teams to solve deep-rooted technical challenges
  • Learn and Lead: Connect with the broader AI community to stay informed of the latest advancements and identify potential partnership opportunities
Employment Type: Full-time

Senior AI Engineer

As a Senior AI Engineer focused on agentic framework, you will focus on building...
Location: Denmark, København
Salary: Not provided
Company: Life Science Talent (life-science-talent-solutions.dk)
Expiration Date: Until further notice
Requirements
  • Strong programming skills in Python and the ability to contribute to production-grade codebases
  • Hands-on experience in LLMs, including at least some of the following:
      • Training, finetuning, or post-training transformer-based models
      • Building or operating LLM inference services in production, including performance work
      • Experience with embeddings, vector databases, and semantic search
      • Practical experience implementing RAG architectures
      • Designing robust evaluations for agent workflows and generative systems, including metrics, error analysis, and human evaluation methods
  • Experience building production-grade ML systems that can be deployed and operated, including pipelines, CI and CD practices, and monitoring
  • Strong product mindset with the ability to translate ideas into working systems
  • Clear communication and collaboration skills across research, engineering, and product
  • A Master’s degree in computer science, engineering, mathematics, statistics, physics, or a related field, or equivalent professional experience
Job Responsibility
  • Design and build LLM-powered product features used in production
  • Develop agentic workflows and frameworks that coordinate multiple AI components
  • Implement RAG architectures using embeddings and vector search
  • Build systems for prompting, context engineering, and tool usage
  • Develop evaluation frameworks to measure LLM and agent performance
  • Work closely with product and platform teams to turn AI capabilities into reliable, scalable product features
  • Continuously improve system reliability, latency, and cost efficiency of AI pipelines
What we offer
  • Equipment provided by Corti
Employment Type: Full-time

Senior Researcher - Systems and Networking

Microsoft Research Asia – Vancouver lab, located in the vibrant city of Vancouve...
Location: Canada, Vancouver
Salary: 114400.00 - 203900.00 CAD / Year
Company: Microsoft Corporation (https://www.microsoft.com/)
Expiration Date: Until further notice
Requirements
  • Doctorate in computer science or relevant field OR equivalent experience
  • Doctorate in relevant field AND 2+ years of related research experience OR equivalent experience
  • Experience publishing academic papers as a lead author or essential contributor
  • Experience participating in a top conference in relevant research domain
  • Background in systems and networking, including experience with machine learning system, database, and networking technologies
  • Deep knowledge of the latest technical advancements, such as agent systems, vector databases, and ML systems
  • A track record of published research in the field of AI-driven system innovation is a plus
Job Responsibility
  • Conduct research on state-of-the-art AI-driven system methods and technologies to identify opportunities for system innovation and acceleration
  • Develop and implement new methodologies, techniques, and algorithms for improving the performance, efficiency, and scalability of AI-driven systems
  • Collaborate with cross-functional teams, including hardware and software engineers, data scientists, and product managers, to drive the development and deployment of innovative AI-driven system solutions
  • Stay current with the latest trends, research, and developments in AI, machine learning, and system architecture to ensure our systems remain at the forefront of innovation
  • Evaluate the performance of AI-driven systems and provide recommendations for improvement and optimization
  • Publish research findings in peer-reviewed journals, conferences, and other relevant venues, and present research results to internal and external stakeholders
  • Mentor and guide other researchers and engineers in their research and development efforts
  • Collaborate with industry partners and academic institutions to drive joint research projects and initiatives
Employment Type: Full-time