We are looking for a dynamic, upbeat software engineer to join our growing team. Your work will focus on building robust, efficient software components that enable high-performance execution of large language models and multimodal models across multi-GPU systems. You’ll collaborate with internal GPU library teams and open-source maintainers to implement features that improve throughput, latency, and scalability. This role emphasizes full-stack development within AI inference systems, with a strong focus on model behavior and framework integration.
Job Responsibilities:
Deep Learning & LLM Framework Optimization
Model-Aware Implementation
Performance-Conscious Coding
Profiling
End-to-End Performance Engineering
Compiler & Pipeline Acceleration
Research & Advanced Techniques
Cross-Team & Open-Source Collaboration
Software Engineering Excellence
Requirements:
Familiarity with Python
Understanding of LLM or multimodal model concepts
Experience working in a Linux development environment
End-to-End LLM Performance Engineering
Bachelor's degree in Computer Science, Computer Engineering, Electrical Engineering, or a related field
Nice to have:
Familiarity with C++ or async programming
Software Engineering Excellence & Community Contribution