Scale is hiring ML Research Engineers to bridge the gap between frontier research and real-world impact. While we solve critical challenges for global governments, your role will extend beyond implementation. You will lead the charge in research into Agent design, Deep Research and AI Safety/reliability, developing novel methodologies that not only power public sector applications but set new standards across the entire Scale organisation.
Your mission is threefold:
Frontier Research & Publication: Leading research into LLM/agent capabilities, reasoning, and safety, with the goal of publishing at top-tier venues (NeurIPS, ICML, ICLR)
Cross-Org Impact: Developing generalised techniques in Agent design, AI Safety and Deep Research agents that scale across our commercial and government platforms
Mission-Critical Applications: Engineering high-stakes AI systems that impact millions of citizens globally
Job Responsibilities:
Pioneer Novel Architectures: Design and train state-of-the-art models and agents, moving beyond “off-the-shelf” solutions to create custom architectures for complex public sector reasoning tasks
Lead AI Safety Initiatives: Research and implement robust safety frameworks, including red teaming, alignment (RLHF/DPO), and bias mitigation strategies essential for sovereign AI
Drive Deep Research Capabilities: Develop agents capable of long-horizon reasoning and autonomous information synthesis to solve complex problems for national security and public policy
Publish and Contribute: Represent Scale in the broader research community by publishing high-impact papers and contributing to open-source breakthroughs
Consult as a Subject Matter Expert: Act as a technical authority for public sector leaders, advising on the theoretical limits and safety requirements of emerging AI
Build Evaluation Frontiers: Create new benchmarks and evaluation protocols that define what success looks like for high-stakes, non-commercial AI applications
Requirements:
Advanced Degree: PhD or Master’s in Computer Science, Mathematics, or a related field with a focus on Deep Learning
Research Track Record: A portfolio of first-author publications at major conferences (NeurIPS, ICML, CVPR, EMNLP, etc.)
Engineering Rigour: Strong proficiency in Python and deep learning frameworks (PyTorch/JAX), with the ability to write production-ready code that scales
Safety Expertise: Experience in alignment, robustness, or interpretability research
Nice to have:
Experience with large-scale distributed training on massive clusters
Experience building reliable agentic systems
Experience in Sovereign AI or working with highly regulated data environments
A zero-to-one mindset: Comfortable navigating ambiguity and defining research directions from scratch