TY - GEN
T1 - Constructing continuous action space from basis functions for fast and stable reinforcement learning
AU - Yamaguchi, Akihiko
AU - Takamatsu, Jun
AU - Ogasawara, Tsukasa
PY - 2009
Y1 - 2009
N2 - This paper presents a new continuous action space for reinforcement learning (RL) based on the wire-fitting [1]. The wire-fitting has a desirable feature: it can be used with action-value-function-based RL algorithms. However, the wire-fitting becomes unstable when the parameters of the actions change. Furthermore, the acquired behavior depends strongly on the initial values of the parameters. The proposed action space extends the DCOB, proposed by Yamaguchi et al. [2], in which a discrete action set is generated from given basis functions. Based on the DCOB, we apply constraints to the parameters in order to obtain stability. We also describe a proper way to initialize the parameters. Simulation results demonstrate that the proposed method outperforms the wire-fitting. On the other hand, its performance is the same as, or inferior to, that of the DCOB. This paper also discusses this result.
AB - This paper presents a new continuous action space for reinforcement learning (RL) based on the wire-fitting [1]. The wire-fitting has a desirable feature: it can be used with action-value-function-based RL algorithms. However, the wire-fitting becomes unstable when the parameters of the actions change. Furthermore, the acquired behavior depends strongly on the initial values of the parameters. The proposed action space extends the DCOB, proposed by Yamaguchi et al. [2], in which a discrete action set is generated from given basis functions. Based on the DCOB, we apply constraints to the parameters in order to obtain stability. We also describe a proper way to initialize the parameters. Simulation results demonstrate that the proposed method outperforms the wire-fitting. On the other hand, its performance is the same as, or inferior to, that of the DCOB. This paper also discusses this result.
KW - Continuous action space
KW - Crawling
KW - Jumping
KW - Motion learning
KW - Reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=72849148517&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=72849148517&partnerID=8YFLogxK
U2 - 10.1109/ROMAN.2009.5326234
DO - 10.1109/ROMAN.2009.5326234
M3 - Conference contribution
AN - SCOPUS:72849148517
SN - 9781424450817
T3 - Proceedings - IEEE International Workshop on Robot and Human Interactive Communication
SP - 401
EP - 407
BT - RO-MAN 2009 - 18th IEEE International Symposium on Robot and Human Interactive Communication
T2 - 18th IEEE International Symposium on Robot and Human Interactive Communication, RO-MAN 2009
Y2 - 27 September 2009 through 2 October 2009
ER -