Abstract
This article proposes a phased reinforcement learning algorithm for controlling complex systems. The key element of the algorithm is a shaping function defined on a novel position-direction space. The shaping function is constructed autonomously once the goal has been reached, and it constrains the exploration area in order to optimize the policy. The efficiency of the proposed shaping function was demonstrated on a complex control problem: positioning a 2-link planar underactuated manipulator.
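The abstract does not give the construction of the shaping function, so the following is only a minimal sketch of the general idea, under stated assumptions: a 2D grid stands in for the manipulator's state space, a "direction" is the unit step taken from a position, and the "promising zone" is the set of cells within a Manhattan radius of one goal-reaching trajectory. The names `build_shaping`, `allowed_actions`, and `radius` are hypothetical and do not come from the paper.

```python
from typing import Callable, Dict, List, Tuple

Pos = Tuple[int, int]
Direction = Tuple[int, int]  # unit step, e.g. (1, 0) means "move right"
ACTIONS: List[Direction] = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def build_shaping(trajectory: List[Pos], radius: int = 1) -> Callable[[Pos, Direction], float]:
    """Build a shaping function from one goal-reaching trajectory.

    Assumption (not from the paper): the function scores a
    (position, direction) pair using the nearest recorded position.
    """
    if len(trajectory) < 2:
        raise ValueError("need at least one transition")

    # Record the direction that was taken at each visited position.
    taken: Dict[Pos, Direction] = {}
    for (x0, y0), (x1, y1) in zip(trajectory, trajectory[1:]):
        taken[(x0, y0)] = (x1 - x0, y1 - y0)

    def shaping(pos: Pos, direction: Direction) -> float:
        # Nearest recorded position by Manhattan distance.
        nearest = min(taken, key=lambda p: abs(p[0] - pos[0]) + abs(p[1] - pos[1]))
        dist = abs(nearest[0] - pos[0]) + abs(nearest[1] - pos[1])
        if dist > radius:
            return -1.0  # outside the assumed promising zone
        # Inside the zone: favor the direction recorded nearby.
        return 1.0 if direction == taken[nearest] else 0.0

    return shaping


def allowed_actions(pos: Pos, shaping: Callable[[Pos, Direction], float]) -> List[Direction]:
    """Second phase: keep only actions whose successor stays inside the zone."""
    kept = [a for a in ACTIONS
            if shaping((pos[0] + a[0], pos[1] + a[1]), a) >= 0.0]
    return kept or list(ACTIONS)  # fall back to all actions if none survive


# First phase is assumed to have produced this goal-reaching path.
path = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
shaping = build_shaping(path, radius=1)
print(allowed_actions((0, 1), shaping))
```

In this sketch, a learner would explore freely until the goal is first reached, call `build_shaping` on the successful path, and from then on draw exploratory actions only from `allowed_actions`, optionally adding the shaping value to the reward. How the paper itself defines the zone and combines it with the policy update is not stated in the abstract.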
Original language | English |
---|---|
Pages (from-to) | 190-196 |
Number of pages | 7 |
Journal | Artificial Life and Robotics |
Volume | 11 |
Issue number | 2 |
Publication status | Published - Jul 2007 |
Keywords
- Human exploration-exploitation strategy
- Promising zone
- Reinforcement learning
- Shaping function
ASJC Scopus subject areas
- Biochemistry, Genetics and Molecular Biology (all)
- Artificial Intelligence