Reinforcement learning intern

France, Paris · Job Posted December 08, 2025

Job Description

As a Reinforcement Learning Intern, you will help develop and implement learning-based navigation and control algorithms for the Mirokai humanoid robot, which balances dynamically on a ball. You will work closely with the team to extend our simulation environments, train agents, and validate policies on real hardware. This internship offers deep hands-on experience in RL for real-world robotics — from simulation to deployment.

Job Responsibility

  • Develop, debug, and test reinforcement learning algorithms for locomotion and navigation on a dynamically balancing base
  • Extend simulation environments (Isaac Sim / Isaac Lab) to support training and evaluation of RL policies
  • Integrate trained policies into the Mirokai software stack and validate them on physical robots
  • Analyze performance, stability, and sim-to-real transfer aspects
  • Stay up to date with recent research in reinforcement learning for robotics

Requirements

  • BSc holder in Robotics, Engineering, Computer Science, or related field
  • Coursework or project experience in reinforcement learning or learning-based control
  • Strong Python skills and knowledge of a deep learning framework such as PyTorch, JAX, or TensorFlow
  • Familiarity with simulation environments such as Isaac Sim, MuJoCo, or Gazebo
  • Solid analytical and problem-solving abilities

Similar Jobs for Reinforcement learning intern

8 matching positions

Research Scientist Intern, Reinforcement Learning

We are committed to advancing the field of artificial intelligence by making fun...
Location: France, Paris
Salary: Not provided
Company: Meta
Expiration Date: Until further notice
Requirements
  • Currently has or is in the process of obtaining a PhD degree in Machine Learning, Artificial Intelligence, Computer Science, Reinforcement Learning, Mathematics, or relevant technical field
  • Solid background in the foundations of reinforcement learning
  • Ability to implement and run reinforcement learning algorithms in complex environments
  • Experience collaborating within a team to solve complex problems
  • Must obtain work authorization in the country of employment at the time of hire and maintain ongoing work authorization during employment
  • Experience with Python, C++, C, Java or other related languages
  • Experience with deep learning frameworks such as PyTorch or JAX
  • Experience building systems based on machine learning and/or deep learning methods
  • Research experience with algorithms for sequential decision-making, e.g., planning, reinforcement learning, or similar
Job Responsibility
  • Develop novel state-of-the-art reinforcement learning algorithms and corresponding systems, leveraging various deep learning techniques
  • Analyze and improve efficiency, scalability, and stability of corresponding deployed algorithms
  • Perform state-of-the-art research to advance the science and technology of Machine Learning and Artificial Intelligence
  • Collaborate with researchers and cross-functional partners including communicating research plans, progress, and results
  • Contribute to research that can be applied to Meta product development

Research Scientist Intern, Reinforcement Learning

We’re looking for a curious and motivated Reinforcement Learning Intern to help ...
Location: United Kingdom, London
Salary: Not provided
Company: Wayve
Expiration Date: Until further notice
Requirements
  • Currently pursuing a PhD or Master's in Computer Science, Robotics, Electrical Engineering, or a related field, with a focus on Machine Learning, AI, or Computer Vision
  • Research experience in Reinforcement Learning
  • Interest in one or more of: synthetic data, representation learning, and offline RL
  • Comfortable working in Python and libraries like PyTorch, NumPy, and Pandas
  • A principled mindset: you enjoy brainstorming, making assumptions, building, testing, and iterating on ideas to see what works
Job Responsibility
  • Help advance the next generation of decision-making systems for autonomous driving
  • Work embedded in a research team to develop scalable RL algorithms that enable vehicles to learn complex behaviors directly from experience — both in simulation and the real world
What we offer
  • Competitive compensation and benefits
  • A dynamic and fast-paced work environment in which you will grow every day, learning on the job from a diverse team of the brightest researchers and engineers in this space
  • A culture that is ego-free, respectful and welcoming
  • Potential to publish your research work at a top-flight conference
  • The chance to be part of a truly mission-driven organisation and an opportunity to shape the future of autonomous driving

PhD Autonomy Engineer Intern - Planning & Controls (Reinforcement Learning)

Skydio builds the world’s most advanced autonomous drones used across inspection...
Location: Switzerland, Zurich
Salary: 50.00 EUR / Hour
Company: Skydio
Expiration Date: Until further notice
Requirements
  • PhD student in Robotics, Machine Learning, Controls, or related field
  • Strong fundamentals in RL, control theory, and motion planning; comfort with safety/robustness concepts
  • Proficient in Python (PyTorch/JAX/Ray RLlib) and at least one of C++ or CUDA
  • Hands-on experience with robotics simulation (Isaac Lab/MuJoCo/PyBullet) and sim2real techniques
  • Experience training/deploying policies for navigation, manipulation, or locomotion on real robots or autonomous vehicles
Job Responsibility
  • Develop and deploy reinforcement learning (and adjacent policy-learning methods) that make Skydio aircraft plan, navigate, and control themselves more intelligently—safely, reliably, and efficiently—across our ecosystem: handheld apps, ground control, cloud autonomy services, and fleet workflows
  • Navigation & avoidance in the wild: Train policies that adapt online to cluttered 3D scenes (forests, bridges, urban canyons), complementing our geometric stack for robust obstacle avoidance and dynamic goal-seeking
  • RL-augmented planning: Fuse learned cost shaping / value functions with trajectory optimization for smooth, agile flight with tight safety envelopes and mission constraints
  • Sim → Real at scale: Build scalable datasets and training loops with Isaac Lab, domain randomization, residual learning, and safety filters; validate on real drones weekly
  • Human-in-the-loop shared control: Learn assistive policies that blend pilot intent, autonomy priors, and uncertainty-aware behaviors for intuitive control handoffs
  • Fleet & multi-agent: Explore decentralized coordination for coverage, pursuit, and collaborative mapping with minimal comms

Senior Engineering Manager, Reinforcement Learning Environments (RLE)

We’re expanding our team and seeking a Senior Engineering Manager to lead our Re...
Location: United States, San Francisco
Salary: 230000.00 - 300000.00 USD / Year
Company: Handshake
Expiration Date: Until further notice
Requirements
  • 3+ years of engineering management experience, with increasing scope and ownership
  • Experience managing senior engineers; experience managing an Engineering Manager (or equivalent scope) strongly preferred
  • 5+ years of prior hands-on engineering experience
  • Strong technical background in platform systems, distributed systems, or full-stack infrastructure
  • Experience building internal platforms, data pipelines, or research-facing tools
  • Proven ability to operate effectively in fast-paced, ambiguous environments
  • Experience driving cross-functional alignment across engineering, research, and operations
  • Willingness to work in-office in San Francisco 5 days/week
Job Responsibility
  • Lead and grow a high-performing team of 8–9 engineers building reinforcement learning environments
  • Manage, mentor, and develop senior engineers and future engineering leaders
  • Partner closely with research, product, and operations teams to define roadmap and execution priorities
  • Drive technical architecture for scalable, reliable, and extensible environment systems
  • Build plug-and-play environments that integrate seamlessly with model training pipelines
  • Balance platform rigor with operational complexity and data quality requirements
  • Establish engineering best practices around reliability, observability, and performance
  • Foster a culture of ownership, velocity, and high technical standards
What we offer
  • Equity in a fast-growing company
  • 401(k) match, competitive compensation, financial coaching
  • Paid parental leave, fertility benefits, parental coaching
  • Medical, dental, and vision, mental health support, $500 wellness stipend
  • $2,000 learning stipend, ongoing development
  • Internet, commuting, and free lunch/gym in our SF office
  • Flexible PTO, 15 holidays + 2 flex days
  • Team outings & referral bonuses
Employment Type: Full-time

Machine Learning Intern

Ema is building next-generation AI to empower every employee in the enterprise t...
Location: United States, San Francisco Bay Area
Salary: Not provided
Company: Ema
Expiration Date: Until further notice
Requirements
  • 2+ years of industry experience in a software engineering, data science, or ML role (full-time or internship)
  • Strong coding skills in Python and experience with ML frameworks like PyTorch or TensorFlow
  • A solid foundation in data structures, algorithms, and software engineering principles
  • Exposure to NLP, deep learning, reinforcement learning, or retrieval systems
  • Experience working with real-world data, and comfort with SQL and data processing pipelines
  • Curiosity about MLOps, cloud infrastructure, and scaling ML models
  • A collaborative mindset, eagerness to learn, and the ability to thrive in a fast-paced environment
Job Responsibility
  • Work alongside senior ML engineers to research, build, and deploy machine learning models across NLP, retrieval, ranking, and reasoning
  • Prototype and experiment with LLM-based architectures and agentic systems
  • Help process and analyze large-scale structured and unstructured datasets
  • Build data pipelines, contribute to model training and evaluation, and participate in the deployment of models in production
  • Assist with validation experiments such as A/B testing and other evaluation methods to ensure robustness and reliability
  • Collaborate closely with cross-functional teams and participate in technical discussions and code reviews
Employment Type: Full-time

AI Research Engineer - Reinforcement Learning

At Helsing we deliver AI-based capabilities and the enabling infrastructure that...
Location: Germany, Munich
Salary: Not provided
Company: Helsing
Expiration Date: Until further notice
Requirements
  • Hold an MSc in machine learning with a speciality in reinforcement learning, multi-agent systems, automation and control, or robotics
  • Have excellent communication skills and the ability to report and present research findings clearly and efficiently both internally and externally
  • Are passionate about keeping up-to-date with current research and enjoy reimplementing / extending papers on state-of-the-art Deep Learning-based approaches
  • Possess solid software engineering skills, writing clean and well-structured code in Python and/or languages like Rust, Java, or modern C++, and experience deploying AI software to production including testing, QA, and monitoring
Job Responsibility
  • Design, train and deploy agents in complex multi-agent environments
  • Contribute to our reinforcement learning stack by implementing, improving and extending the current state of the art in multi-agent reinforcement learning
  • Be a part of impactful projects and collaborate with people across several teams and backgrounds to integrate cutting-edge ML/AI into our production systems
What we offer
  • Competitive compensation and stock options
  • Relocation support
  • Social and education allowances
  • Regular company events and all-hands to bring together employees as one team across Europe
  • A hands-on onboarding program (affectionately labelled “AI-duction”), in which you will be familiarising yourself with our tools and ML pipelines used across the company
Employment Type: Full-time

Associate Director, Reinforcement Learning (ML)

Lead Amgen’s strategy and execution for Reinforcement Learning from Human Feedba...
Location: United States, Thousand Oaks; Jacksonville
Salary: Not provided
Company: Amgen
Expiration Date: Until further notice
Requirements
  • Doctorate degree and 3 years of Computer Science, IT or related field experience, or
  • Master’s degree and 5 years of Computer Science, IT or related field experience, or
  • Bachelor’s degree and 7 years of Computer Science, IT or related field experience, or
  • Associate’s degree and 12 years of Computer Science, IT or related field experience, or
  • High school diploma / GED and 14 years of Computer Science, IT or related field experience
  • Deep, hands-on expertise in Reinforcement Learning from Human Feedback (RLHF) and/or advanced reinforcement learning, including reward modeling, policy optimization, exploration strategies, and offline/online evaluation
  • Demonstrated experience deploying RLHF or RL systems into production for real-world applications (e.g., large language models, recommendation systems, decision support tools, or workflow automation), ideally in healthcare, life sciences, or other regulated domains
  • Strong background in modern machine learning and deep learning, with practical experience in Python and frameworks such as PyTorch or TensorFlow, and familiarity with LLM ecosystems and tooling
  • Experience driving sophisticated, cross-functional initiatives, collaborating with non-technical stakeholders (e.g., physicians, scientists, commercial leaders, compliance, legal) and translating needs into impactful AI solutions
  • Strong ability to communicate complex technical topics simply, tailoring content to senior executives and non-technical audiences
Job Responsibility
  • Lead the design and development of RLHF systems including reward modeling, policy optimization, safety and alignment mechanisms, and evaluation frameworks for large language models and other AI systems
  • Drive hands-on technical execution, particularly for high-impact projects, reviewing architectures, experimentation plans, and code, and helping the team navigate scientific and engineering trade-offs
  • Establish best-practice pipelines for human feedback, partnering closely with internal customer teams to define feedback protocols, annotation quality standards, and governance for RLHF data
  • Define and track success metrics for RLHF systems, balancing offline and online evaluation, A/B tests, safety and robustness criteria, and business or scientific outcomes
  • Collaborate across Amgen leaders to ensure RLHF solutions are aligned with strategy, compliant with policy, and integrated into real workflows
  • Partner with Data, Platform and Technology teams to ensure that RLHF workloads are supported by scalable data platforms, model hosting, experimentation infrastructure, and MLOps best practices
  • Champion responsible and compliant AI, working with Legal, Compliance, and Information Security to implement governance around human feedback, data usage, model behavior, transparency, and risk management in a regulated environment
  • Communicate insights and influence senior stakeholders, creating clear narratives, roadmaps, and recommendations that help executives understand RLHF trade-offs, risks, and opportunities
What we offer
  • A comprehensive employee benefits package, including a Retirement and Savings Plan with generous company contributions, group medical, dental and vision coverage, life and disability insurance, and flexible spending accounts
  • A discretionary annual bonus program, or for field sales representatives, a sales-based incentive plan
  • Stock-based long-term incentives
  • Award-winning time-off plans
  • Flexible work models where possible

Machine Learning Engineering Intern

Airbnb is seeking a Machine Learning Engineer Intern for our 2026 Summer Intern ...
Location: United States
Salary: 20.00 - 50.00 USD / Hour
Company: Airbnb
Expiration Date: Until further notice
Requirements
  • Doctoral students with at least 1 semester of school remaining after the internship (expected graduation after December 2026)
  • Studying Computer Science or a related field
  • Knowledge of AI, especially large language model (LLM) fundamentals including fine tuning, data synthesis, reinforcement learning, etc.
  • Expertise in building large-scale ReAct-style AI agents
  • Proficient in Python and open-source LLM libraries such as Hugging Face
  • Work authorization for employment in the United States is required
Job Responsibility
  • Drive a project from end-to-end that aligns with your technical interest and goals
  • Explore how new advancements in AI agents can be applied to the Airbnb search and discovery experience
  • Own and drive a capstone project from beginning to end
  • Collaborate with multiple team members to achieve project milestones
  • Communicate with stakeholders across different teams to provide project updates
  • Seek and provide feedback throughout the internship
  • Actively participate in and contribute to the Engineering org and broader Airbnb community
What we offer
  • Employee Travel Credits