Reinforcement learning for balancer embedded humanoid locomotion

Akihiko Yamaguchi, Sang Ho Hyon, Tsukasa Ogasawara

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

Reinforcement learning (RL) applications in robotics are of great interest because of their wide applicability; however, many RL applications suffer from large learning costs. We study a new learning-to-walk scheme in which a humanoid robot is embedded with a primitive balancing controller for safety. In this paper, we investigate several RL methods for the walking task. The system has two modes, double stance and single stance, and the selectable action spaces (sub-action spaces) change according to the mode. We therefore compare a hierarchical RL approach and a function approximator (FA) approach in simulation. To handle the sub-action spaces, we introduce a structured FA. The results demonstrate that non-hierarchical RL algorithms with the structured FA are much faster than the hierarchical RL algorithm. The robot obtains appropriate walking gaits in around 30 episodes (20-30 min), which is considered applicable to a real humanoid robot.
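The mode-dependent sub-action spaces described in the abstract can be illustrated with a small sketch: a tabular Q-learner that restricts both action selection and bootstrapping to the sub-action space of the current mode. All names, the action sets, and the toy structure below are hypothetical illustrations of the general idea, not the paper's actual structured FA or controller.

```python
import random
from collections import defaultdict

# Hypothetical sub-action spaces per mode (illustrative only).
SUB_ACTIONS = {
    "double_stance": ["shift_weight_left", "shift_weight_right", "lift_foot"],
    "single_stance": ["swing_forward", "place_foot"],
}

class StructuredQ:
    """Q-learning over mode-dependent sub-action spaces (a sketch)."""

    def __init__(self, alpha=0.5, gamma=0.9, epsilon=0.1):
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def actions(self, mode):
        # Only the sub-action space of the current mode is selectable.
        return SUB_ACTIONS[mode]

    def select(self, state, mode):
        # Epsilon-greedy selection restricted to the current mode's actions.
        acts = self.actions(mode)
        if random.random() < self.epsilon:
            return random.choice(acts)
        return max(acts, key=lambda a: self.q[(state, a)])

    def update(self, s, a, r, s_next, mode_next):
        # Bootstrap only over actions valid in the successor mode.
        best_next = max(self.q[(s_next, a2)] for a2 in self.actions(mode_next))
        td_error = r + self.gamma * best_next - self.q[(s, a)]
        self.q[(s, a)] += self.alpha * td_error
```

The key design point is that one value structure is shared across modes while the set of admissible actions changes with the mode, which avoids the overhead of maintaining fully separate learners per mode.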

Original language: English
Title of host publication: 2010 10th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2010
Pages: 308-313
Number of pages: 6
DOIs
Publication status: Published - 2010
Externally published: Yes
Event: 2010 10th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2010 - Nashville, TN, United States
Duration: 2010 Dec 6 - 2010 Dec 8

Publication series

Name: 2010 10th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2010

Other

Other: 2010 10th IEEE-RAS International Conference on Humanoid Robots, Humanoids 2010
Country/Territory: United States
City: Nashville, TN
Period: 10/12/6 - 10/12/8

ASJC Scopus subject areas

  • Artificial Intelligence
  • Hardware and Architecture
  • Human-Computer Interaction
