Neural networks and differential dynamic programming for reinforcement learning problems

Akihiko Yamaguchi, Christopher G. Atkeson

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

23 Citations (Scopus)

Abstract

We explore a model-based approach to reinforcement learning in which partially or totally unknown dynamics are learned and explicit planning is performed. We learn the dynamics with neural networks and plan behaviors with differential dynamic programming (DDP). To handle complicated dynamics, such as manipulating liquids (pouring), we consider temporally decomposed dynamics. We start from our recent work [1], in which we used locally weighted regression (LWR) to model dynamics. The major contribution of this paper is the use of deep learning, in the form of neural networks, with stochastic DDP, and a demonstration of the advantages of neural networks over LWR. For this purpose, we extend neural networks for: (1) modeling prediction error and output noise, (2) computing an output probability distribution for a given input distribution, and (3) computing gradients of the output expectation with respect to an input. Because neural networks have nonlinear activation functions, these extensions are not straightforward. We provide an analytic solution for them under some simplifying assumptions. We verified the method in pouring simulation experiments: learning performance with neural networks was better than with LWR, and the amount of spilled material was reduced. We also present early results of robot experiments using a PR2. Accompanying video: https://youtu.be/aM3hE1J5W98
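
To make extension (2) concrete, the following is a minimal sketch of how an input distribution can be pushed through a small network in closed form. It is not the authors' formulation: the abstract does not state which simplifying assumptions or activation functions the paper uses. The sketch assumes ReLU activations and treats every unit as an independent Gaussian (correlations between units are ignored). The function names `relu_gaussian_moments`, `linear_moments`, and `propagate` are hypothetical and chosen only for illustration.

```python
import math


def relu_gaussian_moments(mu, var):
    """Mean and variance of max(0, x) for a scalar x ~ N(mu, var)."""
    if var <= 0.0:
        # Deterministic input: ReLU simply clips the value.
        return max(0.0, mu), 0.0
    sigma = math.sqrt(var)
    z = mu / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density at z
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF at z
    mean = mu * cdf + sigma * pdf
    second_moment = (mu * mu + var) * cdf + mu * sigma * pdf
    # Clamp at zero to guard against tiny negative values from round-off.
    return mean, max(0.0, second_moment - mean * mean)


def linear_moments(W, b, mu, var):
    """Moments of y = W x + b, assuming the components of x are independent."""
    out_mu = [sum(w * m for w, m in zip(row, mu)) + bi for row, bi in zip(W, b)]
    out_var = [sum(w * w * v for w, v in zip(row, var)) for row in W]
    return out_mu, out_var


def propagate(layers, mu, var):
    """Propagate (mean, variance) through a list of (W, b) layers.

    ReLU is applied after every layer except the last (an assumption
    of this sketch, not something stated in the abstract).
    """
    for i, (W, b) in enumerate(layers):
        mu, var = linear_moments(W, b, mu, var)
        if i < len(layers) - 1:
            moments = [relu_gaussian_moments(m, v) for m, v in zip(mu, var)]
            mu = [m for m, _ in moments]
            var = [v for _, v in moments]
    return mu, var
```

A call such as `propagate([(W1, b1), (W2, b2)], mu, var)`, with weights given as nested lists, returns the approximate output mean and variance per output unit. Under the same assumptions, the derivative of the ReLU output mean with respect to the input mean is the standard normal CDF evaluated at `mu / sigma`, which hints at how gradients of the output expectation, as in extension (3), might be obtained analytically.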

Original language: English
Title of host publication: 2016 IEEE International Conference on Robotics and Automation, ICRA 2016
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5434-5441
Number of pages: 8
ISBN (Electronic): 9781467380263
Publication status: Published - 2016 Jun 8
Externally published: Yes
Event: 2016 IEEE International Conference on Robotics and Automation, ICRA 2016 - Stockholm, Sweden
Duration: 2016 May 16 to 2016 May 21

Publication series

Name: Proceedings - IEEE International Conference on Robotics and Automation
Volume: 2016-June
ISSN (Print): 1050-4729

Other

Other: 2016 IEEE International Conference on Robotics and Automation, ICRA 2016
Country/Territory: Sweden
City: Stockholm
Period: 16/5/16 to 16/5/21

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Artificial Intelligence
  • Electrical and Electronic Engineering
