TY - GEN
T1 - Gradient based algorithms with loss functions and kernels for improved on-policy control
AU - Robards, Matthew
AU - Sunehag, Peter
PY - 2012
Y1 - 2012
AB - We introduce and empirically evaluate two novel online gradient-based reinforcement learning algorithms with function approximation: one model-based and the other model-free. These algorithms allow non-squared loss functions, which is novel in reinforcement learning and appears to bring empirical advantages. We further extend a previous gradient-based algorithm to the case of full control by using generalized policy iteration. Theoretical properties of these algorithms are studied in a companion paper.
UR - http://www.scopus.com/inward/record.url?scp=84861701646&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-29946-9_7
DO - 10.1007/978-3-642-29946-9_7
M3 - Conference contribution
AN - SCOPUS:84861701646
SN - 9783642299452
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 30
EP - 41
BT - Recent Advances in Reinforcement Learning - 9th European Workshop, EWRL 2011, Revised Selected Papers
T2 - 9th European Workshop on Reinforcement Learning, EWRL 2011
Y2 - 9 September 2011 through 11 September 2011
ER -