Reinforcement learning intern

Enchanted Tools

Location:
France, Paris

Contract Type:
Not provided

Salary:

Not provided

Job Description:

As a Reinforcement Learning Intern, you will help develop and implement learning-based navigation and control algorithms for the Mirokai humanoid robot, which balances dynamically on a ball. You will work closely with the team to extend our simulation environments, train agents, and validate policies on real hardware. This internship offers deep hands-on experience in RL for real-world robotics — from simulation to deployment.

Job Responsibility:

  • Develop, debug, and test reinforcement learning algorithms for locomotion and navigation on a dynamically balancing base
  • Extend simulation environments (Isaac Sim / Isaac Lab) to support training and evaluation of RL policies
  • Integrate trained policies into the Mirokai software stack and validate them on physical robots
  • Analyze performance, stability, and sim-to-real transfer aspects
  • Stay up to date with recent research in reinforcement learning for robotics

Requirements:

  • BSc in Robotics, Engineering, Computer Science, or a related field
  • Coursework or project experience in reinforcement learning or learning-based control
  • Strong Python skills and knowledge of a deep learning framework (PyTorch, JAX, or TensorFlow)
  • Familiarity with simulation environments such as Isaac Sim, MuJoCo, or Gazebo
  • Solid analytical and problem-solving abilities

Additional Information:

Job Posted:
December 08, 2025

Work Type:
On-site work

Similar Jobs for Reinforcement learning intern

PhD Autonomy Engineer Intern - Planning & Controls (Reinforcement Learning)

Skydio builds the world’s most advanced autonomous drones used across inspection...
Location:
Switzerland, Zurich
Salary:
50.00 EUR / Hour
Skydio
Expiration Date
Until further notice
Requirements
  • PhD student in Robotics, Machine Learning, Controls, or related field
  • Strong fundamentals in RL, control theory, and motion planning; comfort with safety/robustness concepts
  • Proficient in Python (PyTorch/JAX/Ray RLlib) and at least one of C++ or CUDA
  • Hands-on experience with robotics simulation (Isaac Lab/MuJoCo/PyBullet) and sim2real techniques
  • Experience training/deploying policies for navigation, manipulation, or locomotion on real robots or autonomous vehicles
Job Responsibility
  • Develop and deploy reinforcement learning (and adjacent policy-learning methods) that make Skydio aircraft plan, navigate, and control themselves more intelligently—safely, reliably, and efficiently—across our ecosystem: handheld apps, ground control, cloud autonomy services, and fleet workflows
  • Navigation & avoidance in the wild: Train policies that adapt online to cluttered 3D scenes (forests, bridges, urban canyons), complementing our geometric stack for robust obstacle avoidance and dynamic goal-seeking
  • RL-augmented planning: Fuse learned cost shaping / value functions with trajectory optimization for smooth, agile flight with tight safety envelopes and mission constraints
  • Sim → Real at scale: Build scalable datasets and training loops with Isaac Lab, domain randomization, residual learning, and safety filters; validate on real drones weekly
  • Human-in-the-loop shared control: Learn assistive policies that blend pilot intent, autonomy priors, and uncertainty-aware behaviors for intuitive control handoffs
  • Fleet & multi-agent: Explore decentralized coordination for coverage, pursuit, and collaborative mapping with minimal comms

AI Research Engineer - Reinforcement Learning

At Helsing we deliver AI-based capabilities and the enabling infrastructure that...
Location:
Germany, Munich
Salary:
Not provided
Helsing
Expiration Date
Until further notice
Requirements
  • Hold an MSc in machine learning with a speciality in reinforcement learning, multi-agent systems, automation and control, or robotics
  • Have excellent communication skills and the ability to report and present research findings clearly and efficiently both internally and externally
  • Are passionate about keeping up-to-date with current research and enjoy reimplementing / extending papers on state-of-the-art Deep Learning-based approaches
  • Possess solid software engineering skills, writing clean and well-structured code in Python and/or languages like Rust, Java, or modern C++, and experience deploying AI software to production including testing, QA, and monitoring
Job Responsibility
  • Design, train and deploy agents in complex multi-agent environments
  • Contribute to our reinforcement learning stack by implementing, improving and extending the current state of the art in multi-agent reinforcement learning
  • Be part of impactful projects and collaborate with people across several teams and backgrounds to integrate cutting-edge ML/AI into our production systems
What we offer
  • Competitive compensation and stock options
  • Relocation support
  • Social and education allowances
  • Regular company events and all-hands to bring together employees as one team across Europe
  • A hands-on onboarding program (affectionately labelled “AI-duction”), in which you will be familiarising yourself with our tools and ML pipelines used across the company
  • Full-time

Associate Director, Reinforcement Learning (ML)

Lead Amgen’s strategy and execution for Reinforcement Learning from Human Feedba...
Location:
United States, Thousand Oaks; Jacksonville
Salary:
Not provided
Amgen
Expiration Date
Until further notice
Requirements
  • Doctorate degree and 3 years of Computer Science, IT or related field experience
  • Master’s degree and 5 years of Computer Science, IT or related field experience
  • Bachelor’s degree and 7 years of Computer Science, IT or related field experience
  • Associate’s degree and 12 years of Computer Science, IT or related field experience
  • High school diploma / GED and 14 years of Computer Science, IT or related field experience
  • Deep, hands-on expertise in Reinforcement Learning from Human Feedback (RLHF) and/or advanced reinforcement learning, including reward modeling, policy optimization, exploration strategies, and offline/online evaluation
  • Demonstrated experience deploying RLHF or RL systems into production for real-world applications (e.g., large language models, recommendation systems, decision support tools, or workflow automation), ideally in healthcare, life sciences, or other regulated domains
  • Strong background in modern machine learning and deep learning, with practical experience in Python and frameworks such as PyTorch or TensorFlow, and familiarity with LLM ecosystems and tooling
  • Experience driving sophisticated, cross-functional initiatives, collaborating with non-technical stakeholders (e.g., physicians, scientists, commercial leaders, compliance, legal) and translating needs into impactful AI solutions
  • Strong ability to communicate complex technical topics simply, tailoring content to senior executives and non-technical audiences
Job Responsibility
  • Lead the design and development of RLHF systems including reward modeling, policy optimization, safety and alignment mechanisms, and evaluation frameworks for large language models and other AI systems
  • Drive hands-on technical execution, particularly for high-impact projects, reviewing architectures, experimentation plans, and code, and helping the team navigate scientific and engineering trade-offs
  • Establish best-practice pipelines for human feedback, partnering closely with internal customer teams to define feedback protocols, annotation quality standards, and governance for RLHF data
  • Define and track success metrics for RLHF systems, balancing offline and online evaluation, A/B tests, safety and robustness criteria, and business or scientific outcomes
  • Collaborate across Amgen leaders to ensure RLHF solutions are aligned with strategy, compliant with policy, and integrated into real workflows
  • Partner with Data, Platform and Technology teams to ensure that RLHF workloads are supported by scalable data platforms, model hosting, experimentation infrastructure, and MLOps best practices
  • Champion responsible and compliant AI, working with Legal, Compliance, and Information Security to implement governance around human feedback, data usage, model behavior, transparency, and risk management in a regulated environment
  • Communicate insights and influence senior stakeholders, creating clear narratives, roadmaps, and recommendations that help executives understand RLHF trade-offs, risks, and opportunities
What we offer
  • A comprehensive employee benefits package, including a Retirement and Savings Plan with generous company contributions, group medical, dental and vision coverage, life and disability insurance, and flexible spending accounts
  • A discretionary annual bonus program, or for field sales representatives, a sales-based incentive plan
  • Stock-based long-term incentives
  • Award-winning time-off plans
  • Flexible work models where possible

Machine Learning Research Associate

The Machine Learning research team at Hewlett Packard Labs seeks highly motivate...
Location:
United States, Milpitas
Salary:
43.27 - 93.15 USD / Hour
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements
  • Pursuing a Ph.D. degree (with significant research and innovation experience) in a relevant discipline (e.g. machine learning, computer science, electrical engineering, statistics, etc.)
  • Track record of world-class innovative contributions and ideas in machine learning
  • Experience in deep learning, LLM, Agentic AI, and reinforcement learning research
  • Experience in developing deep learning software with high proficiency in data structures and algorithms
  • Experience in Machine Learning frameworks like PyTorch - required
  • Strong programming skills and experience with Python
  • Software development experience in Deep Learning, GPU acceleration, and Model Optimization
  • Demonstrated effective communication and collaboration skills
  • Demonstrated ability to produce original research, with papers published in top-tier conferences or journals.
Job Responsibility
  • Provide thought leadership and technical influence both internally and externally to HPE
  • Work on cutting-edge machine learning research focusing on Large Language Models, Agentic AI, and Reinforcement Learning
  • Contribute along the full range from initial novel ideas to design, development, implementation, evaluation, and technology transfer
  • Publish in top AI conferences and workshops, including NeurIPS, AAAI, and ICML.
What we offer
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
  • Full-time

Machine Learning Research Scientist

This role focuses on cutting-edge research and development in Artificial Intelli...
Location:
United States, Milpitas
Salary:
117500.00 - 270000.00 USD / Year
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements
  • PhD in Computer Science, Electrical Engineering, or related fields focusing on Machine Learning for the dissertation
  • extensive experience in deep learning research, preferably in Large Language Models or Reinforcement Learning
  • experience developing applications with deep learning frameworks like PyTorch with a high software proficiency
  • strong programming skills in Python, data structures, and algorithms are required
  • experience with ML model optimization, GPU acceleration, heterogeneous computation, system software, and performance optimization desired
  • experience in Python Web Frameworks – Django, Flask - a plus but not required.
Job Responsibility
  • conducting research, developing solutions, and creating intellectual property in emerging fields like reinforcement learning, LLMs, digital twins, clean energy, data center optimization, and sustainability
  • developing advanced technologies for analysis, optimization, time series forecasting, uncertainty quantification, and control
  • providing thought leadership, collaborating internally and externally, and contributing to HPE’s strategy by identifying emerging technologies
  • publishing in top conferences like NeurIPS, AAAI, and ACL
  • developing patent applications
  • software development, GPU acceleration, model optimization, and real-time data streaming to create robust AI solutions for real-world use cases.
What we offer
  • a competitive salary and extensive social benefits
  • diverse and dynamic work environment
  • work-life balance and support for career development
  • health and wellbeing programs
  • personal and professional development programs
  • diversity, inclusion, and belonging initiatives.
  • Full-time

Engineering Director

We are seeking a seasoned Engineering Director who thrives in challenging and fa...
Location:
Puerto Rico, Aguadilla
Salary:
Not provided
Hewlett Packard Enterprise
Expiration Date
Until further notice
Requirements
  • Significant work experience as a director or in a similar position working across multiple stakeholder organizations, with at least 10 years of people leadership experience specific to SW and Cloud engineering
  • Solid experience leading SW development across storage, networking, on-prem, and SaaS is a must
  • Experience in setting up geographically distributed sites
  • Must have a strong background in software development lifecycle including cloud infrastructure
  • Familiarity with agile methodologies and tools like JIRA
  • Prior experience in cloud product development and deployments, with end-to-end ownership and accountability
  • Solid understanding of fundamental AI and machine learning concepts, including supervised and unsupervised learning, deep learning, reinforcement learning, natural language processing, computer vision, and statistical modeling
  • Extensive business acumen, technical knowledge, and industry experience encompassing one or more engineering, technology, and product domains
  • Demonstrated abilities to drive transformation across a business with exceptional skills in the management of change
Job Responsibility
  • Oversee the Puerto Rico Site daily operations, strategic planning and cross-functional team leadership for Hybrid Cloud
  • Recruit, mentor, and manage teams of AI/ML engineers, QA Engineers, Design Engineers and innovation specialists to deliver cutting-edge solutions
  • Continuously evaluate new tools, platforms, and frameworks in AI/ML to drive competitive advantage and operational efficiency
  • Ensure alignment with corporate goals while fostering a high-performance culture, operational efficiency, and employee engagement
  • Lead the development and execution of AI/ML strategies that align with business goals and drive innovation across products, services, or operations
  • Create strategic and tactical operations and resource plans, goals, and priorities for assigned organization based on business and technology roadmap and functional objectives
  • Engage with various senior leaders across the organization, program managers, R&D, support, Quality, product managers, technical leaders and executives to communicate program status, escalate issues, and guide and influence strategic decision-making
  • Manage senior relationships and escalated issues with outsourced partners and suppliers, including setting expectations regarding deliverables, product quality, schedules, and costs; ensure that the organization is effectively leveraging outsourced resources
  • Identify opportunities for and drive organizational initiatives and programs to support business process improvements and cost reductions
What we offer
  • Health & Wellbeing
  • Personal & Professional Development
  • Unconditional Inclusion
  • Full-time

Research Scientist Intern, Reinforcement Learning

We’re looking for a curious and motivated Reinforcement Learning Intern to help ...
Location:
United Kingdom, London
Salary:
Not provided
Wayve
Expiration Date
Until further notice
Requirements
  • Currently pursuing a PhD or Masters in Computer Science, Robotics, Electrical Engineering, or a related field, with a focus on Machine Learning, AI, or Computer Vision
  • Experience in research in Reinforcement Learning
  • Interest in one or more: synthetic data, representation learning, and Offline RL
  • Comfortable working in Python and libraries like PyTorch, NumPy, and Pandas
  • A principled mindset: you enjoy brainstorming, making assumptions, building, testing, and iterating on ideas to see what works
Job Responsibility
  • Help advance the next generation of decision-making systems for autonomous driving
  • Work embedded in a research team to develop scalable RL algorithms that enable vehicles to learn complex behaviors directly from experience — both in simulation and the real world
What we offer
  • Competitive compensation and benefits
  • A dynamic and fast-paced work environment in which you will grow every day, learning on the job from a diverse team of the brightest researchers and engineers in this space
  • A culture that is ego-free, respectful and welcoming
  • Potential to publish your research work at a top-flight conference
  • The chance to be part of a truly mission-driven organisation and an opportunity to shape the future of autonomous driving