Scale AI is the data foundation for AI, helping organizations build and deploy reliable production AI applications. The General Agents team, part of Scale’s Enterprise organization, builds robust general agents for customer use cases and applications. The team sits at the intersection of frontier agent development and real-world deployment, translating state-of-the-art reasoning and agentic capabilities into reliable, production-grade systems that drive real economic value. Our agents are scalable systems built around recurring enterprise problem domains, with a strong emphasis on generalization, extensibility, and deployment across many customers.
Job Responsibilities:
Design and implement end-to-end agent systems that combine LLM reasoning, tool use, memory, and control logic to solve recurring enterprise use cases
Build scalable, reliable agent architectures that can be deployed across many customers with varying data, tools, and constraints
Develop evaluation frameworks, datasets, environments, and metrics to measure agent performance, reliability, and business impact in production settings
Collaborate closely with product managers, customers, data annotators, and other engineering teams to translate enterprise requirements into robust agent designs
Productionize frontier agent techniques (e.g., planning, multi-step reasoning and tool use, multi-agent patterns) into maintainable, observable systems
Own deployment, monitoring, and iteration of agent systems, including failure analysis and continuous improvement based on real-world usage
Contribute to technical direction, architectural decisions, and best practices for general agent development, with increasing scope and leadership at the Staff level
Requirements:
5+ years of experience building and deploying machine learning or AI systems for real-world, production use cases
Strong engineering fundamentals, supported by a Bachelor’s and/or Master’s degree in Computer Science, Machine Learning, AI, or equivalent practical experience
Deep understanding of modern LLMs, prompt-, context-, and system-level optimization, and agentic system design
Proven proficiency in Python, including writing production-quality, testable, and maintainable code
Experience building systems that integrate models with external tools, APIs, databases, and services
Ability to operate in ambiguous problem spaces, balancing research-driven approaches with pragmatic product constraints
Strong communication skills and comfort working in customer-facing or cross-functional environments
Nice to have:
Hands-on experience building AI agents using modern generative AI stacks (OpenAI APIs, commercial or open-source LLMs)
Experience with agent frameworks, orchestration layers, or workflow systems (e.g., tool calling, planners, multi-agent setups)
Familiarity with evaluation, monitoring, and observability for LLM-powered systems in production
Experience deploying ML systems in cloud environments and operating them at scale
Experience fine-tuning or adapting foundation models using methods like supervised fine-tuning (SFT), reinforcement learning with verifiable rewards (RLVR), and low-rank adaptation (LoRA) to improve agent performance on domain-specific tasks
Interest in shaping the future of general-purpose enterprise agents and their real-world impact