Reinforcement Learning Engineer Job at Wiremind (Paris)

Research Engineer, Reinforcement Learning

As a Research Engineer specializing in Reinforcement Learning, you will be respo...

Location

United States, Palo Alto

Salary:

180000.00 - 250000.00 USD / Year

1X Technologies

Expiration Date

Until further notice

Requirements

Strong programming experience in Python and/or C++
Proficiency with PyTorch
Hands-on experience with simulation platforms like Isaac Sim or MuJoCo
Experience training reinforcement learning policies, particularly for manipulation or locomotion
Ability to collaborate cross-functionally with hardware, control, data, and QA teams
Demonstrated experience addressing the sim-to-real gap

Job Responsibility

Own the full stack of engineering tasks: from data engineering and model architecture to delivering polished products
Train NEO on a wide variety of manipulation and locomotion tasks
Collaborate with hardware teams to bridge the sim-to-real gap for policies trained in simulation
Partner with controls, quality assurance, and data collection teams to ship RL policies to production
Deploy reinforcement learning-trained skills into real-world home environments

What we offer

Health, dental, and vision insurance
401(k) with company match
Paid time off and holidays

Fulltime

Senior Reinforcement Learning Engineer

Figure is an AI Robotics company developing a general purpose humanoid. Our Huma...

Location

United States, San Jose

Salary:

150000.00 - 400000.00 USD / Year

Figure

Expiration Date

Until further notice

Requirements

Confident writing production quality code in PyTorch
Familiar with online and offline reinforcement learning algorithms: PPO, SAC, etc.
Experience tuning hyperparameters and cost functions for these RL algorithms
Familiarity with common RL techniques such as: domain randomization, curriculum learning, reward shaping, etc.
Familiarity with general ML evaluation tools such as TensorBoard, Weights & Biases, etc.
Strong mix of industry and research experience, ideally 5-7+ years of experience

Job Responsibility

Develop, train, and deploy reinforcement learning algorithms for locomotion and manipulation tasks
Build simulation infrastructure to support large-scale training of locomotion and manipulation policies for a general purpose humanoid robot
Collaborate with the controls team to integrate policies into the existing control stack
Define, test, and evaluate performance metrics for learned policies

Fulltime

AI Research Engineer - Reinforcement Learning

At Helsing we deliver AI-based capabilities and the enabling infrastructure that...

Location

Germany, Munich

Salary:

Not provided

Helsing

Expiration Date

Until further notice

Requirements

Hold an MSc in machine learning with a specialty in reinforcement learning, multi-agent systems, automation and control, or robotics
Have excellent communication skills and the ability to report and present research findings clearly and efficiently both internally and externally
Are passionate about keeping up-to-date with current research and enjoy reimplementing / extending papers on state-of-the-art Deep Learning-based approaches
Possess solid software engineering skills, writing clean and well-structured code in Python and/or languages like Rust, Java, or modern C++, and experience deploying AI software to production including testing, QA, and monitoring

Job Responsibility

Design, train and deploy agents in complex multi-agent environments
Contribute to our reinforcement learning stack by implementing, improving and extending the current state of the art in multi-agent reinforcement learning
Be part of impactful projects and collaborate with people across several teams and backgrounds to integrate cutting-edge ML/AI into our production systems

What we offer

Competitive compensation and stock options
Relocation support
Social and education allowances
Regular company events and all-hands to bring together employees as one team across Europe
A hands-on onboarding program (affectionately labelled “AI-duction”), in which you will be familiarising yourself with our tools and ML pipelines used across the company

Fulltime

Applied Research Lead, Reinforcement Learning

We are building AI to simulate the world through merging art and science. We bel...

Location

United States

Salary:

280000.00 - 380000.00 USD / Year

Runway

Expiration Date

Until further notice

Requirements

4+ years of relevant engineering or research experience in applying reinforcement learning to align language, image, and/or video generation models
Very strong programming skills and ability to write clean and maintainable research code
Deep interest in building human-in-the-loop systems for creativity
Passion for seeing research through from initial conception to eventual application
Experience mentoring and teaching other researchers
Strong communication, collaboration, and documentation skills

Job Responsibility

Lead efforts in applying reinforcement learning based techniques to improve the quality and controllability of the models that power Runway’s research and tools

Fulltime

Associate Director, Reinforcement Learning (ML)

Lead Amgen’s strategy and execution for Reinforcement Learning from Human Feedba...

Location

United States, Thousand Oaks; Jacksonville

Salary:

Not provided

Amgen

Expiration Date

Until further notice

Requirements

Doctorate degree and 3 years of Computer Science, IT or related field experience
Master’s degree and 5 years of Computer Science, IT or related field experience
Bachelor’s degree and 7 years of Computer Science, IT or related field experience
Associate’s degree and 12 years of Computer Science, IT or related field experience
High school diploma / GED and 14 years of Computer Science, IT or related field experience
Deep, hands-on expertise in Reinforcement Learning from Human Feedback (RLHF) and/or advanced reinforcement learning, including reward modeling, policy optimization, exploration strategies, and offline/online evaluation
Demonstrated experience deploying RLHF or RL systems into production for real-world applications (e.g., large language models, recommendation systems, decision support tools, or workflow automation), ideally in healthcare, life sciences, or other regulated domains
Strong background in modern machine learning and deep learning, with practical experience in Python and frameworks such as PyTorch or TensorFlow, and familiarity with LLM ecosystems and tooling
Experience driving sophisticated, cross-functional initiatives, collaborating with non-technical stakeholders (e.g., physicians, scientists, commercial leaders, compliance, legal) and translating needs into impactful AI solutions
Strong ability to communicate complex technical topics simply, tailoring content to senior executives and non-technical audiences

Job Responsibility

Lead the design and development of RLHF systems including reward modeling, policy optimization, safety and alignment mechanisms, and evaluation frameworks for large language models and other AI systems
Drive hands-on technical execution, particularly for high-impact projects, reviewing architectures, experimentation plans, and code, and helping the team navigate scientific and engineering trade-offs
Establish best-practice pipelines for human feedback, partnering closely with internal customer teams to define feedback protocols, annotation quality standards, and governance for RLHF data
Define and track success metrics for RLHF systems, balancing offline and online evaluation, A/B tests, safety and robustness criteria, and business or scientific outcomes
Collaborate with leaders across Amgen to ensure RLHF solutions are aligned with strategy, compliant with policy, and integrated into real workflows
Partner with Data, Platform and Technology teams to ensure that RLHF workloads are supported by scalable data platforms, model hosting, experimentation infrastructure, and MLOps best practices
Champion responsible and compliant AI, working with Legal, Compliance, and Information Security to implement governance around human feedback, data usage, model behavior, transparency, and risk management in a regulated environment
Communicate insights and influence senior stakeholders, creating clear narratives, roadmaps, and recommendations that help executives understand RLHF trade-offs, risks, and opportunities

What we offer

A comprehensive employee benefits package, including a Retirement and Savings Plan with generous company contributions, group medical, dental and vision coverage, life and disability insurance, and flexible spending accounts
A discretionary annual bonus program, or for field sales representatives, a sales-based incentive plan
Stock-based long-term incentives
Award-winning time-off plans
Flexible work models where possible

Bike sharing system rebalancing by reinforcement learning algorithms

This internship project focuses on a specific component of a broader initiative ...

Location

France, Lyon

Salary:

Not provided

ABG - Association Bernard Gregory

Expiration Date

Until further notice

Requirements

Master’s student (M2) or in the final year of an Engineering School program
Background in Computational Mechanics, Applied Mathematics, or Data Science
Knowledge and experience in numerical modeling and simulation of physical or dynamical systems
Knowledge and experience in machine learning or statistical data analysis
Knowledge and experience in time series forecasting and spatio-temporal modeling
Knowledge and experience in optimization and/or reinforcement learning methods
Programming skills in Python (preferred), including libraries such as NumPy, Pandas, PyTorch, or TensorFlow
Experience with data visualization and exploratory data analysis
Familiarity with version control tools (e.g., Git) and collaborative coding practices
Good written and oral communication skills in English

Job Responsibility

Develop predictive models to estimate short-term bicycle availability and demand at both the station and network levels using spatio-temporal data
Analyze and preprocess heterogeneous datasets, including trip records, station metadata, weather conditions, and temporal factors, to create robust inputs for modeling
Implement and compare different machine learning approaches (e.g., time series forecasting, graph neural networks, spatio-temporal models) to capture flow dynamics in the bikeshare system
Evaluate the performance and scalability of predictive algorithms under realistic conditions, using metrics relevant to operational decision-making in mobility systems
Provide data-driven inputs for the reinforcement learning module, enabling the development of adaptive and real-time rebalancing strategies in the second phase of the project
Integrate uncertainty quantification to assess the confidence of predictions and their impact on rebalancing decisions
Explore online or incremental learning techniques to enable continuous model adaptation as new data streams become available

Reinforcement learning intern

As a Reinforcement Learning Intern, you will help develop and implement learning...

Location

France, Paris

Salary:

Not provided

Enchanted Tools

Expiration Date

Until further notice

Requirements

BSc holder in Robotics, Engineering, Computer Science, or related field
Coursework or project experience in reinforcement learning or learning-based control
Strong Python skills and knowledge of a deep learning framework such as PyTorch, JAX, or TensorFlow
Familiarity with simulation environments such as Isaac Sim, MuJoCo, or Gazebo
Solid analytical and problem-solving abilities

Job Responsibility

Develop, debug, and test reinforcement learning algorithms for locomotion and navigation on a dynamically balancing base
Extend simulation environments (Isaac Sim / Isaac Lab) to support training and evaluation of RL policies
Integrate trained policies into the Mirokai software stack and validate them on physical robots
Analyze performance, stability, and sim-to-real transfer aspects
Stay up to date with recent research in reinforcement learning for robotics

Thesis project: reinforcement learning environments for AI agents

Your thesis will connect to our ongoing work with Predli Studio and help shape h...

Location

Sweden, Stockholm

Salary:

Not provided

Predli

Expiration Date

Until further notice

Requirements

Enrolled in a master’s program in Machine Learning, AI, Data Science, Computer Science, or Engineering Physics (or a related field)
Curious, analytical, and eager to explore how AI can be applied in practice
Skilled in Python and TypeScript
Confident in taking initiative, communicating ideas clearly, and working both independently and collaboratively
Excited to learn from and contribute to a small, high-impact team where knowledge sharing and experimentation are part of everyday life
Preferably based in Stockholm, with the possibility to work partly remotely
Fluent in English

Job Responsibility

Focus on the development of advanced RL scenarios that challenge agent adaptability, generalization and decision-making under uncertainty, providing valuable insights into the capabilities and limitations of current RL approaches
Collaborate with engineers and researchers to define a clear scope that fits both academic requirements and ongoing applied AI work

What we offer

Work alongside experienced AI engineers and researchers who will collaborate with you throughout your thesis
Get access to real-world data, infrastructure, and insights from ongoing AI projects
Contribute directly to the development of Predli Studio and help shape how organizations build and deploy AI in practice
Be part of a collaborative environment where learning, curiosity, and knowledge sharing are valued and encouraged
Gain exposure to both the consulting and product sides of applied AI
Possibility to continue your journey with us after your thesis

Reinforcement Learning Engineer

Wiremind

Location:
France, Paris

Category:
IT - Software Development

Contract Type:
Not provided

Salary:

Job Description:

Job Responsibility:

Requirements:

Nice to have:

Additional Information:

Job Posted:
January 05, 2026


