You will work within the Applications and Performance team (more than 30 engineers), whose main mission is to respond to HPC & AI calls for tender. This team plans and guarantees the performance of customers' scientific applications on the proposed supercomputers. Your role will be the preparation, execution, and analysis of AI applications, mainly benchmarks.

These benchmarks generally consist of training or inference runs on a reference dataset under defined rules, in order to evaluate the performance of the system in terms of time or throughput. The usual benchmarks are taken from the MLPerf suite (Training and Inference: Datacenter). The purpose of the benchmarking activity is to characterize the application on a current system in order to project its behavior onto target systems: latency vs. throughput, accuracy vs. performance, scaling efficiency, etc. Target systems are computing clusters usually equipped with accelerators (e.g., NVIDIA GPUs, AMD GPUs, Intel Gaudi accelerators). The benchmarks are therefore run in a multi-node, multi-accelerator framework. Part of the work consists of estimating the performance of systems that do not yet exist (new technology, larger scale, etc.).
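As a purely illustrative sketch (not part of the role description), a throughput-oriented benchmark of the kind described above typically times a fixed workload, excludes warm-up iterations, and reports samples per second. The function and dummy step below are hypothetical placeholders, not MLPerf code:

```python
import time

def run_benchmark(step_fn, num_steps, batch_size, warmup=2):
    """Time a step function and report throughput.

    step_fn: callable executing one training/inference step on one batch
             (a placeholder here; in practice this would drive a real model).
    Returns (elapsed_seconds, samples_per_second).
    """
    # Warm-up steps (e.g. JIT compilation, cache population) are excluded
    # from the timed region, as is common in benchmarking methodology.
    for _ in range(warmup):
        step_fn()
    start = time.perf_counter()
    for _ in range(num_steps):
        step_fn()
    elapsed = time.perf_counter() - start
    return elapsed, num_steps * batch_size / elapsed

# Dummy CPU-bound step standing in for a real training/inference step.
def dummy_step():
    sum(i * i for i in range(10_000))

elapsed, throughput = run_benchmark(dummy_step, num_steps=10, batch_size=32)
print(f"{throughput:.1f} samples/s over {elapsed:.3f}s")
```

In a real multi-node, multi-accelerator run, the same timing logic would wrap distributed training steps, and per-node throughput would be aggregated across ranks.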
Job Responsibilities:
AI benchmark analysis:
  Literature review on the application under consideration
  Code exploration (if available)
  Matching hyperparameters to the hardware architecture
Benchmark preparation:
  Software environment (usually containerized)
  Training and job scripts
  Dataset preparation
  Hyperparameter search
  Documentation
Benchmark execution and performance estimation:
  Test executions
  Analysis
  Reports
Requirements:
Relevant higher-education or university degree
Ideally, several years of experience in the field of AI
Autonomous, with a strong team spirit
Good understanding of the fundamentals of deep learning
Experience in training classic neural network architectures (MLP, CNN, RNN)