We explore differential dynamic programming for dynamical systems that form a directed graph structure. This planning method applies to complicated tasks in which sub-tasks are sequentially connected and different skills are selected according to the situation. Pouring is an example: it involves grasping and moving a container, as well as selecting among skills such as tipping and shaking. Our method handles these situations: it plans the continuous parameters of each sub-task and skill, and selects which skills to use. The method is based on stochastic differential dynamic programming, and we use stochastic neural networks to learn the dynamical systems when they are unknown. Although our method is a form of reinforcement learning, it also draws on ideas from artificial intelligence, such as graph-structured dynamical systems and frames-and-slots representations of large state-action vectors; this work is thus a partial unification of these fields. We demonstrate our method in a simulated pouring task, where it generalizes over material properties and container shapes. Accompanying video: https://youtu.be/-ECmnG2BLE8.