TY - GEN
T1 - Prediction for control delay on reinforcement learning
AU - Saito, Junya
AU - Narisawa, Kazuyuki
AU - Shinohara, Ayumi
PY - 2012/6/15
Y1 - 2012/6/15
N2 - This paper addresses reinforcement learning problems with constant control delay, in both the known-delay and unknown-delay cases. First, we propose an algorithm for known delay, which is a simple extension of the model-free learning algorithm introduced by Schuitema et al. (2010). We extend it to predict current states explicitly, and empirically show that it is more efficient than existing algorithms. Next, we consider the case where the delay is unknown but its maximum value is bounded. For this case, we propose an algorithm that uses the accuracy of state predictions. We show that the algorithm performs as efficiently as the one that knows the real delay.
AB - This paper addresses reinforcement learning problems with constant control delay, in both the known-delay and unknown-delay cases. First, we propose an algorithm for known delay, which is a simple extension of the model-free learning algorithm introduced by Schuitema et al. (2010). We extend it to predict current states explicitly, and empirically show that it is more efficient than existing algorithms. Next, we consider the case where the delay is unknown but its maximum value is bounded. For this case, we propose an algorithm that uses the accuracy of state predictions. We show that the algorithm performs as efficiently as the one that knows the real delay.
KW - Control delay
KW - Machine learning
KW - Markov decision process
KW - Reinforcement learning
UR - http://www.scopus.com/inward/record.url?scp=84862136772&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862136772&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84862136772
SN - 9789898425959
T3 - ICAART 2012 - Proceedings of the 4th International Conference on Agents and Artificial Intelligence
SP - 579
EP - 586
BT - ICAART 2012 - Proceedings of the 4th International Conference on Agents and Artificial Intelligence
T2 - 4th International Conference on Agents and Artificial Intelligence, ICAART 2012
Y2 - 6 February 2012 through 8 February 2012
ER -