TY - GEN
T1 - Avoiding wireheading with value reinforcement learning
AU - Everitt, Tom
AU - Hutter, Marcus
N1 - Publisher Copyright: © Springer International Publishing Switzerland 2016.
PY - 2016
Y1 - 2016
N2 - How can we design good goals for arbitrarily intelligent agents? Reinforcement learning (RL) may seem like a natural approach. Unfortunately, RL does not work well for generally intelligent agents, as RL agents are incentivised to shortcut the reward sensor for maximum reward – the so-called wireheading problem. In this paper we suggest an alternative to RL called value reinforcement learning (VRL). In VRL, agents use the reward signal to learn a utility function. The VRL setup allows us to remove the incentive to wirehead by placing a constraint on the agent’s actions. The constraint is defined in terms of the agent’s belief distributions, and does not require an explicit specification of which actions constitute wireheading.
UR - http://www.scopus.com/inward/record.url?scp=84977502866&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-41649-6_2
DO - 10.1007/978-3-319-41649-6_2
M3 - Conference contribution
SN - 9783319416489
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 12
EP - 22
BT - Artificial General Intelligence - 9th International Conference, AGI 2016, Proceedings
A2 - Steunebrink, Bas
A2 - Wang, Pei
A2 - Goertzel, Ben
PB - Springer Verlag
T2 - 9th International Conference on Artificial General Intelligence, AGI 2016
Y2 - 16 July 2016 through 19 July 2016
ER -