As a member of our Platform Development team, you will be instrumental in building and optimizing high-performance trading systems, research compute clusters, databases, support systems, and more. You will work extensively with Linux and Windows internals on servers in our HPC environment.
Job Responsibilities:
Contribute to our library of home-grown tools, written primarily in Python and Bash, to automate monitoring and maintenance
Work closely with Strategy Developers, Quantitative Researchers, and trade-supporting application teams to translate complex problems into scalable solutions
Coordinate with IT infrastructure teams, including storage and networking, to identify and implement the best solutions
Tune operating systems and batch workflows for performance
Perform in-depth root-cause analysis of system issues
Integrate all of these solutions into our systems effectively and efficiently
Oversee all aspects of our HPC environment, including the scheduler, parallel filesystems, GPUs, and interconnects
Implement and optimize high-performance storage solutions, including Lustre, VAST, and GPFS
Develop strategies to ensure optimal resource allocation and scalability
Utilize monitoring and diagnostic tools to quickly pinpoint failures, streamline troubleshooting processes, and ensure the timely recovery of disrupted workflows
Requirements:
A Bachelor’s degree in Engineering, Computer Science, Information Systems, or a related discipline
5-7 years of progressive experience building Linux- and/or Windows-based HPC platforms
Familiarity with kernel-level and I/O subsystem tweaks and tools such as sysctl, strace, tcpdump, and netstat
Recent hands-on experience with automation in Python or other tools
Experience administering Lustre, GPFS, VAST, or other parallel filesystems
Understanding of resource schedulers like HTCondor, SLURM, or similar
Nice to have:
Equivalent Windows knowledge (registry, Procmon, Wireshark, tshark)