Applied Research - RL & Agents

Prime Intellect

Location:
United States, San Francisco

Contract Type:
Not provided

Salary:

Not provided

Job Description:

Prime Intellect builds the infrastructure that frontier AI labs build internally, and makes it available to everyone. Our platform, Lab, unifies environments, evaluations, sandboxes, and high-performance training into a single full-stack system for post-training at frontier scale, from RL and SFT to tool use, agent workflows, and deployment. We validate everything by using it ourselves, training open state-of-the-art models on the same stack we put in your hands. We're looking for people who want to build at the intersection of frontier research and real infrastructure.

Job Responsibility:

  • Advancing Agent Capabilities: Designing and iterating on next-generation AI agents that tackle real workloads—workflow automation, reasoning-intensive tasks, and decision-making at scale
  • Building Robust Infrastructure: Developing the systems and frameworks that enable these agents to operate reliably, efficiently, and at massive scale
  • Bridge Between Applications & Research: Translate ambiguous objectives into clear technical requirements that guide product and research priorities
  • Prototype in the Field: Rapidly design and deploy agents, evals, and harnesses for real-world tasks to validate solutions
  • Application-Driven Research & Infrastructure: Shape the direction and feature set for verifiers, the Environments Hub, training services, and other research platform offerings
  • Build high‑quality examples, reference implementations, and “recipes” that make it easy for others to extend the stack
  • Prototype agents and eval harnesses tailored to real-world use cases and external systems
  • Pair with technical end‑users (research teams, infra‑heavy customers, open‑source contributors) to design environments, evals, and verifiers that reflect real workloads
  • Post-training & Reinforcement Learning: Design and implement novel RL and post-training methods (RLHF, RLVR, GRPO, etc.) to align large models with domain-specific tasks
  • Build evaluations and harnesses to measure reasoning, robustness, and agentic behavior in real-world workflows
  • Prototype multi-agent and memory-augmented systems to expand capabilities for downstream applications
  • Experiment with post-training recipes to optimize downstream performance
  • Agent Development & Infrastructure: Rapidly prototype and iterate on AI agents for automation, workflow orchestration, and decision-making
  • Extend and integrate with agent frameworks to support evolving feature requests and performance requirements
  • Architect and maintain distributed training/inference pipelines, ensuring scalability and cost efficiency
  • Develop observability and monitoring (Prometheus, Grafana, tracing) to ensure reliability and performance in production deployments

Requirements:

  • Strong background in machine learning engineering, with experience in post-training, RL, or large-scale model alignment
  • Experience with agent frameworks and tooling (e.g. DSPy, LangGraph, MCP, Stagehand)
  • Familiarity with distributed training/inference frameworks (e.g., vLLM, sglang, Accelerate, Ray, Torch)
  • Track record of research contributions (publications, open-source contributions, benchmarks) in ML/RL
  • Passion for advancing the state-of-the-art in reasoning and building practical, agentic AI systems
  • Strong technical writing abilities (documentation, blogs, papers) and research taste
  • Eagerness to drive collaborations with external partners and engage with the broader open-source community

Nice to have:

  • Experience with web programming (React, TypeScript, Next.js)
  • Experience running LLM evaluations and/or synthetic data generation
  • Experience deploying containerized systems at scale (Docker, Kubernetes, Terraform)

What we offer:

  • Competitive Compensation + equity incentives
  • Flexible Work (San Francisco or hybrid-remote)
  • Visa Sponsorship & relocation support
  • Professional Development budget
  • Team Off-sites & conference attendance

Additional Information:

Job Posted:
February 21, 2026

Employment Type:
Full-time
Work Type:
On-site work