Differential dynamic programming with temporally decomposed dynamics

Akihiko Yamaguchi, Christopher G. Atkeson

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

17 Citations (Scopus)

Abstract

We explore a temporal decomposition of dynamics in order to enhance policy learning with unknown dynamics. There are model-free methods and model-based methods for policy learning with unknown dynamics, but both approaches have problems: in general, model-free methods have less generalization ability, while model-based methods are often limited by the assumed model structure or need to gather many samples to make models. We consider a temporal decomposition of dynamics to make learning models easier. To obtain a policy, we apply differential dynamic programming (DDP). A feature of our method is that we consider decomposed dynamics even when there is no action to be taken, which allows us to decompose dynamics more flexibly. Consequently learned dynamics become more accurate. Our DDP is a first-order gradient descent algorithm with a stochastic evaluation function. In DDP with learned models, typically there are many local maxima. In order to avoid them, we consider multiple criteria evaluation functions. In addition to the stochastic evaluation function, we use a reference value function. This method was verified with pouring simulation experiments where we created complicated dynamics. The results show that we can optimize actions with DDP while learning dynamics models.
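The idea of differentiating through a chain of decomposed dynamics, including a stage where no action is taken, can be illustrated with a toy example. The following is only a minimal sketch and not the authors' implementation: the two stage models `f1` and `f2`, the reward, and all constants are hypothetical hand-written functions, whereas the abstract describes learned models and a stochastic evaluation function. The sketch is deterministic and performs plain first-order gradient ascent on a single action.

```python
# Toy sketch of first-order action optimization through decomposed dynamics.
# All models, names, and constants are hypothetical; they are not from the paper.

def f1(x0, a):
    """Stage 1 (action taken): state after applying action a."""
    return x0 + 2.0 * a

def df1_da(x0, a):
    """Derivative of stage 1 with respect to the action."""
    return 2.0

def f2(x1):
    """Stage 2 (no action taken): passive dynamics from x1 to x2."""
    return 0.5 * x1 * x1

def df2_dx1(x1):
    """Derivative of stage 2 with respect to its input state."""
    return x1

def reward(x2, target):
    """Negative squared error between the final state and the target."""
    return -(x2 - target) ** 2

def dreward_dx2(x2, target):
    """Derivative of the reward with respect to the final state."""
    return -2.0 * (x2 - target)

def optimize_action(x0, target, a_init=0.1, lr=0.02, iters=500):
    """Gradient ascent on the reward with respect to the single action a."""
    a = a_init
    for _ in range(iters):
        # Forward pass through the two decomposed stages.
        x1 = f1(x0, a)
        x2 = f2(x1)
        # Backward pass: chain rule from the reward back to the action.
        grad = dreward_dx2(x2, target) * df2_dx1(x1) * df1_da(x0, a)
        a += lr * grad
    return a

if __name__ == "__main__":
    a_opt = optimize_action(x0=0.0, target=2.0)
    print(a_opt, f2(f1(0.0, a_opt)))
```

Because the second stage has no action, its only role in the backward pass is to propagate the gradient from the final state back to the stage where an action exists. With learned, stochastic models the same chain rule would apply to expected values, and a poor starting action could lead to a different local optimum.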

Original language: English
Title of host publication: Humanoids 2015
Subtitle of host publication: Humanoids in the New Media Age - IEEE RAS International Conference on Humanoid Robots
Publisher: IEEE Computer Society
Pages: 696-703
Number of pages: 8
ISBN (Electronic): 9781479968855
DOIs
Publication status: Published - 2015 Dec 22
Externally published: Yes
Event: 15th IEEE RAS International Conference on Humanoid Robots, Humanoids 2015 - Seoul, Korea, Republic of
Duration: 2015 Nov 3 to 2015 Nov 5

Publication series

Name: IEEE-RAS International Conference on Humanoid Robots
Volume: 2015-December
ISSN (Print): 2164-0572
ISSN (Electronic): 2164-0580

Other

Other: 15th IEEE RAS International Conference on Humanoid Robots, Humanoids 2015
Country/Territory: Korea, Republic of
City: Seoul
Period: 15/11/3 to 15/11/5

Keywords

  • Computational modeling
  • Containers
  • Dynamic programming
  • Heuristic algorithms
  • Optimization
  • Robots
  • Stochastic processes

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Vision and Pattern Recognition
  • Hardware and Architecture
  • Human-Computer Interaction
  • Electrical and Electronic Engineering

