As part of the Microsoft Azure AI Knowledge group, the team builds Document Intelligence capabilities that semantically structure documents for intelligent processing across traditional scenarios (RPA, search indexing, compliance, security) and modern LLM-based applications (RAG, agent memory). The team is on a mission to empower people through AI in everyday document tasks. We are looking for a Senior Software Engineer to be a technical expert in our core engine team. In this role, you will bridge the gap between AI models and production infrastructure, taking ownership of complex architectural decisions. You will drive initiatives to optimize AI model inference, architect highly efficient infrastructure, and design scalable features to extend our document understanding capabilities. This is a high-impact role designed for an experienced engineer who thrives on solving ambiguous technical challenges. You will work alongside researchers and the service team to turn state-of-the-art prototypes into robust, high-performance, enterprise-grade solutions used by customers worldwide, while also acting as a mentor who raises the engineering bar of the team.
Job Responsibilities:
Runtime Architecture & Development: Lead the design and implementation of critical, high-quality code in C++, C#, and Python. You will design and implement the core inference for our exceptional OCR and document layout analysis engine. You will also design the technical strategy for integrating Microsoft built-in and open-source solutions to support a broader range of formats (Word, Excel, PowerPoint), ensuring the system is extensible and maintainable
Inference Optimization Strategy: Spearhead efforts to optimize deep learning model inference for maximum speed and throughput. You will define performance benchmarks and drive low-level optimizations, utilizing hardware accelerators to ensure our models run efficiently at massive scale
System Infrastructure & Scalability: Architect and oversee the pipeline design for highly scalable AI services. You will drive best practices in containerization and deployment (Docker, Kubernetes), ensuring the system is not only functional but also resilient, observable, and cost-effective
Technical Leadership & Mentorship: Act as a technical role model within the team. You will lead code reviews and drive architectural discussions across teams. You will be responsible for upholding engineering excellence and debugging the most complex, systemic issues that span the stack
Requirements:
Master’s degree in Computer Science or a related field (or equivalent practical experience)
5+ years of professional software engineering experience with a track record of delivering complex, high-impact systems
Expert-level proficiency in at least one of the following languages: C++, C#, or Python, with the ability to architect solutions across a polyglot environment
Deep understanding of Computer Science fundamentals, including advanced data structures, algorithms, and distributed system design
System Architecture: Proven ability to design systems that are scalable, reliable, and maintainable, and to make sound trade-off decisions under ambiguity
Nice to have:
Proven experience in high-performance model inference optimization (e.g., CUDA, TensorRT, ONNX Runtime) in a production environment
Deep expertise with containerization and orchestration technologies, specifically Docker and Kubernetes, at an enterprise scale
Solid understanding of Machine Learning concepts and hands-on experience integrating ML frameworks (e.g., PyTorch, TensorFlow) into production pipelines
Experience in low-level code optimization for latency and memory management
Experience processing complex document formats (PDF internals, Office Open XML) is a strong plus