Applied Research - RL & Agents Job at Prime Intellect (San Francisco)

Research Scientist Intern

Meta is seeking Research Interns to join our Meta Superintelligence Lab in one o...

Location

France, Paris

Salary:

Not provided

Meta

Expiration Date

Until further notice

Requirements

Currently has or is in the process of obtaining a Ph.D. degree in Computer Science, Artificial Intelligence, Generative AI, or a relevant technical field
Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment
Experience with Python, C++, C, Java or other related languages
Experience building systems based on machine learning and/or deep learning methods

Job Responsibility

Develop novel state-of-the-art agentic AI algorithms and corresponding systems, leveraging machine learning and reinforcement learning techniques
Conduct research on agentic LLMs, agentic RL environments, LLM post-training, and related topics
Analyze and improve the efficiency, scalability, and stability of agentic AI algorithms and deployed systems
Advance the science and technology of intelligent, agentic machines capable of reasoning, tool use, and personalized interactions
Collaborate with researchers and cross-functional partners, including communicating research plans, progress, and results
Disseminate research results through publications, presentations, and open source contributions
When applicable, contribute to research that can be applied to Meta product development

Research Scientist Intern

Meta is seeking Research Interns to join our Meta Superintelligence Lab in one o...

Location

France, Paris

Salary:

Not provided

Meta

Expiration Date

Until further notice

Requirements

Currently has or is in the process of obtaining a Masters degree in the field of Natural Language Processing, Machine Learning, Artificial Intelligence, or similar or relevant technical field
Research and/or work experience in Natural Language Processing
Research and/or work experience in Machine Learning or Deep Learning
Experience in Python, C++, or other related languages
Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment

Job Responsibility

Develop novel state-of-the-art agentic AI algorithms and corresponding systems, leveraging machine learning and reinforcement learning techniques
Conduct research on agentic LLMs, agentic RL environments, LLM post-training, and related topics
Analyze and improve the efficiency, scalability, and stability of agentic AI algorithms and deployed systems
Advance the science and technology of intelligent, agentic machines capable of reasoning, tool use, and personalized interactions
Collaborate with researchers and cross-functional partners, including communicating research plans, progress, and results
Disseminate research results through publications, presentations, and open source contributions
When applicable, contribute to research that can be applied to Meta product development

Staff Research Scientist - RL and Agents

We are building a team focusing on personal multi-agent systems, that will go ou...

Location

Switzerland, Zurich

Salary:

Not provided

Meta

Expiration Date

Until further notice

Requirements

PhD in Computer Science or a related field with published projects in the fields of machine learning, deep learning, robotics, large language models and/or computer vision
Proven development skills in Deep Learning, working with PyTorch or TensorFlow
Experience developing LLM algorithms or infrastructure in Python or C/C++
First-authored publications at peer-reviewed conferences, e.g. ICLR, ICML, CVPR, ECCV, ICCV, NeurIPS, SIGGRAPH

Job Responsibility

Conduct applied research to advance the state of the art in Language / Multimodal Models and Agentic systems
Consistently and sustainably advance the state of the art for your problem, including setting and executing against roadmaps for 6-month plus timeframes
Collaborate with different cross-functional teams across the globe in research and product

Applied Research - Evals & Data

Prime Intellect builds the infrastructure that frontier AI labs build internally...

Location

United States, San Francisco

Salary:

Not provided

Prime Intellect

Expiration Date

Until further notice

Requirements

Strong background in machine learning engineering, with experience in post-training, RL, or large-scale model alignment
Experience with applied data workflows and evaluation frameworks for large models or agents (e.g., SWE-Bench, HELM, EvalFlow, internal eval pipelines)
Deep expertise in distributed training/inference frameworks (e.g., vLLM, sglang, Ray, Accelerate)
Experience deploying containerized systems at scale (Docker, Kubernetes, Terraform)
Track record of research contributions (publications, open-source contributions, benchmarks) in ML/RL
Passion for advancing the state-of-the-art in reasoning, measurement, and building practical, agentic AI systems

Job Responsibility

Advancing Agent Capabilities: Designing and iterating on next-generation AI agents that tackle real workloads—workflow automation, reasoning-intensive tasks, and decision-making at scale
Building Robust Infrastructure: Developing the distributed systems, evaluation pipelines, and coordination frameworks that enable these agents to operate reliably, efficiently, and at massive scale
Bridge Between Customers & Research: Translating customer needs and insights from applied data into clear technical requirements that guide product and research priorities
Prototype in the Field: Rapidly designing and deploying agents, evals, and harnesses alongside customers to validate solutions
Customer-Facing Engineering: Work side-by-side with customers to deeply understand workflows, data sources, and bottlenecks
Post-training & Reinforcement Learning: Design and implement novel RL and post-training methods (RLHF, RLVR, GRPO, etc.) to align large models with domain-specific tasks
Agent Development & Infrastructure: Rapidly prototype and iterate on AI agents for automation, workflow orchestration, and decision-making

What we offer

Competitive Compensation + equity incentives
Flexible Work (remote or San Francisco)
Visa Sponsorship & relocation support
Professional Development budget
Team Off-sites & conference attendance

Full-time

Principal/Senior Applied Scientist Security Models Training Team - Next-Gen frontier research

The Security Models Training team is expanding to drive the development of a new...

Location

Israel, Tel Aviv, Herzliya

Salary:

Not provided

Microsoft Corporation

Expiration Date

Until further notice

Requirements

M.Sc. / Ph.D. in Computer Science, Information Systems, Electrical or Computer Engineering or Data Science (Ph.D. strongly preferred)
Candidates with M.Sc. / Ph.D. in related fields with proven industry experience or a strong publication record in the areas of LLM, Information Retrieval, Machine Learning, Natural Language Processing, Time Series Forecasting and Deep Learning are considered as well
Proven hands-on experience of at least 5 years (including post-grad work) in building and deploying Machine Learning products
Key areas of expertise include Natural Language Processing and Large Language Models, along with an understanding of concepts such as Privacy and Responsible AI
Candidates are expected to demonstrate a strong history of successfully translating applied research into production-ready solutions, along with a proven track record of delivering projects within large-scale production environments
Proven expertise in the LLM and/or time-series forecasting domain, demonstrating comprehensive knowledge of relevant concepts in the domain
Ideal applicants should be proficient in areas such as LLM pre- and post-training (including CPT, SFT, and RL), LLM benchmarking, agentic flows, and model alignment
Hands-on experience in building neural model architectures at the 100M+ scale and the proficiency to adapt them at all abstraction levels down to the individual block (e.g. changing the inner workings of an attention block, introducing new blocks, or changing the routings)
Demonstrated proficiency in problem-solving and data analysis, with substantial expertise in evaluating the performance of large language models (LLMs) and/or time-series forecasting models, developing benchmarks tailored to practical scenarios

Job Responsibility

Technical Leadership & Ownership: set technical direction for major security domain initiatives; lead security model programs spanning pre‑training, task tuning, reinforcement learning, and evaluation; translate cutting‑edge research into production‑ready capabilities
Advanced Model Design – Building and customizing deep learning model architectures (e.g., modifying transformer blocks, attention/memory modules, etc.) at the SLM/LLM scale; making principled architectural tradeoffs to improve reliability, robustness, and security‑specific behavior
Advanced Model Training – Apply deep expertise in pre-training, post-training, and reinforcement learning (RL) for both language and other modalities, including time-series
Design & Evaluate Datasets – Build high-quality datasets and benchmarks; define objective evaluation frameworks and quality gates; run ablation studies to measure impact and optimize data and training effectiveness to support confident product decisions
Develop Data Infrastructure – Create and maintain scalable pipelines for ingestion, preprocessing, filtering, and annotation of large, complex datasets, with attention to privacy, governance, and long‑term reuse across security scenarios

Full-time

Lead Applied Scientist Security Models Training Team

The Security Models Training team is expanding to drive the development of a new...

Location

Israel, Tel Aviv, Herzliya

Salary:

Not provided

Microsoft Corporation

Expiration Date

Until further notice

Requirements

M.Sc. / Ph.D. in Computer Science, Information Systems, Electrical or Computer Engineering or Data Science (Ph.D. strongly preferred)
Candidates with M.Sc. / Ph.D. in related fields with proven industry experience or a strong publication record in the areas of LLM, Information Retrieval, Machine Learning, Natural Language Processing, Time Series Forecasting and Deep Learning are considered as well
Proven hands-on experience of at least 8 years (including post-grad work) in building and deploying Machine Learning products
Key areas of expertise include Natural Language Processing and Large Language Models, along with an understanding of concepts such as Privacy and Responsible AI
Candidates are expected to demonstrate a strong history of successfully translating applied research into production-ready solutions, along with a proven track record of delivering projects within large-scale production environments
Demonstrated ability to set long‑term technical strategy, align multiple teams, and serve as a technical decision‑maker for high‑risk, high‑impact investments
Proven expertise in the LLM and/or time-series forecasting domain, demonstrating comprehensive knowledge of relevant concepts in the domain
Ideal applicants should be proficient in areas such as LLM pre- and post-training (including CPT, SFT, and RL), LLM benchmarking, agentic flows, and model alignment
Hands-on experience in building neural model architectures at the 100M+ scale and the proficiency to adapt them at all abstraction levels down to the individual block (e.g. changing the inner workings of an attention block, introducing new blocks, or changing the routings)
Demonstrated proficiency in problem-solving and data analysis, with substantial expertise in evaluating the performance of large language models (LLMs) and/or time-series forecasting models, developing benchmarks tailored to practical scenarios

Job Responsibility

Technical Leadership & Ownership: set technical direction for major security domain initiatives and align roadmaps across multiple teams; lead security model programs spanning pre‑training, task tuning, reinforcement learning, and evaluation; translate cutting‑edge research into production‑ready capabilities. This role influences portfolio‑level technical tradeoffs, investment prioritization, and long‑term architecture decisions for security models
Advanced Model Design – Building and customizing deep learning model architectures (e.g., modifying transformer blocks, attention/memory modules, etc.) at the SLM/LLM scale; making principled architectural tradeoffs to improve reliability, robustness, and security‑specific behavior
Advanced Model Training – Apply deep expertise in pre-training, post-training, and reinforcement learning (RL) for both language and other modalities, including time-series
Design & Evaluate Datasets – Build high-quality datasets and benchmarks; define objective evaluation frameworks and quality gates; run ablation studies to measure impact and optimize data and training effectiveness to support confident product decisions

Full-time

Thesis project: reinforcement learning environments for AI agents

Your thesis will connect to our ongoing work with Predli Studio and help shape h...

Location

Sweden, Stockholm

Salary:

Not provided

Predli

Expiration Date

Until further notice

Requirements

Enrolled in a master’s program in Machine Learning, AI, Data Science, Computer Science, or Engineering Physics (or a related field)
Curious, analytical, and eager to explore how AI can be applied in practice
Skilled in Python and TypeScript
Confident in taking initiative, communicating ideas clearly, and working both independently and collaboratively
Excited to learn from and contribute to a small, high-impact team where knowledge sharing and experimentation are part of everyday life
Preferably based in Stockholm, with the possibility of working partly remotely
Fluent in English

Job Responsibility

Focus on the development of advanced RL scenarios that challenge agent adaptability, generalization and decision-making under uncertainty, providing valuable insights into the capabilities and limitations of current RL approaches
Collaborate with engineers and researchers to define a clear scope that fits both academic requirements and ongoing applied AI work

What we offer

Work alongside experienced AI engineers and researchers who will collaborate with you throughout your thesis
Get access to real-world data, infrastructure, and insights from ongoing AI projects
Contribute directly to the development of Predli Studio and help shape how organizations build and deploy AI in practice
Be part of a collaborative environment where learning, curiosity, and knowledge sharing are valued and encouraged
Gain exposure to both the consulting and product sides of applied AI
Possibility to continue your journey with us after your thesis

Language Research Scientist

We are seeking a technically skilled GenAI scientist to join our team focused on...

Location

Switzerland, Zurich

Salary:

Not provided

Meta

Expiration Date

Until further notice

Requirements

Currently has or is in the process of obtaining a Ph.D. degree in Computer Science, Artificial Intelligence, Generative AI, or a relevant technical field
Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent practical experience)
Good programming skills in Python and familiarity with large-scale distributed training
Ability to learn new programming languages quickly
Can design, implement, and evaluate RL algorithms in production or research settings
Problem-solving, communication, and collaboration skills

Job Responsibility

Design, implement, and optimize LLM-based agents for a variety of applications, leveraging the latest advances in generative AI
Apply reinforcement learning algorithms to improve LLM performance, safety, and alignment
Integrate models and orchestrations in production
Collaborate with cross-functional teams (research, engineering, product) to deploy and evaluate LLM agents in real-world scenarios
Analyze and interpret experimental results, iterate on model architectures, and drive continuous improvement
Contribute to the broader AI/ML community at Meta through knowledge sharing, code reviews, and technical mentorship
Lead and contribute to research and development of post-training methods, including RLHF (Reinforcement Learning from Human Feedback), reward modeling, and other feedback-based approaches

Applied Research - RL & Agents

Prime Intellect

Location:
United States, San Francisco

Category:
IT - Software Development

Contract Type:
Not provided

Salary:

Job Description:

Job Responsibility:

Requirements:

Nice to have:

Additional Information:

Job Posted:
February 21, 2026

