This list contains only the countries for which job offers have been published in the selected language (e.g., in the French version, only job offers written in French are displayed, and in the English version, only those in English).
We are seeking a senior performance devtools engineer with deep, hands‑on experience analyzing and optimizing complex systems across Linux, Windows, and FreeBSD. This role focuses on solving real performance problems end‑to‑end and building the tooling required to do so effectively.
Job Responsibility:
Perform end‑to‑end performance analysis of complex workloads across Linux, Windows, and FreeBSD
Identify and resolve CPU, memory, cache, NUMA, scheduler, I/O, concurrency, and power/thermal bottlenecks
Select and apply different performance analysis workflows based on problem class (latency, throughput, power, scalability)
Design, develop, and enhance performance analysis and tuning toolchains used by engineers and power users
Write production‑quality C++ code to implement analysis tools, optimizations, experiments, and fixes
Build and integrate AI agents to automate performance analysis, data triage, regression detection, and reporting
Design and operate MCP servers (or equivalent agent backends) to connect models, tools, profilers, and data sources securely
Drive power and performance optimizations at system, OS, and application layers
Validate improvements using reproducible methodology, benchmarks, and data‑driven reporting
Collaborate with architecture, system software, tools, and hardware teams to influence design decisions
Author technical papers, whitepapers, blogs, or conference publications
Contribute tooling, fixes, or analysis back to open‑source communities
Requirements:
Proven hands‑on experience performing performance analysis on complex systems
Deep knowledge of CPU micro‑architecture (pipelines, caches, SMT, NUMA, memory hierarchy, power states)
Expertise analyzing performance across Linux, Windows, and FreeBSD
Demonstrated experience in power and performance optimization
Strong, hands‑on C++ programming skills (ability to build tools, land fixes, and maintain production code)
Experience developing or enhancing performance tuning toolchains
Strong ability to correlate hardware behavior, OS behavior, and application behavior
Clear understanding of how to approach and solve different classes of performance problems, including: CPU‑bound vs memory‑bound, Latency‑sensitive vs throughput‑driven workloads, Lock contention, scheduling, and concurrency bottlenecks, Power‑limited, thermal‑limited, and frequency‑scaling scenarios
Ability to define repeatable, explainable workflows for diagnosis → hypothesis → measurement → fix → validation
Experience writing and using AI agents for performance analysis, automation, or developer productivity
Ability to integrate AI responsibly into performance workflows (analysis, triage, optimization, reporting)
Experience using AI to scale performance expertise, not just accelerate coding
Meaningful open‑source contributions (code, tooling, performance investigations, or documentation)
Authorship of multiple technical publications, such as: Conference papers or presentations, Technical blogs or whitepapers, Open‑source design or performance analysis documents
Nice to have:
Experience with hardware performance counters and low‑level profiling mechanisms
Background in HPC, cloud infrastructure, or large‑scale production systems
Experience influencing product or architecture decisions using performance data