Generalised discount functions applied to a Monte-Carlo AIμ implementation

Sean Lamont, John Aslanides, Jan Leike, Marcus Hutter

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

    5 Citations (Scopus)

    Abstract

    In recent years, work has been done to develop the theory of General Reinforcement Learning (GRL). However, there are no examples demonstrating the known results regarding generalised discounting. We have added to the GRL simulation platform (AIXIjs) the functionality to assign an agent arbitrary discount functions, and an environment which can be used to determine the effect of discounting on an agent's policy. Using this, we investigate how geometric, hyperbolic and power discounting affect an informed agent in a simple MDP. We experimentally reproduce a number of theoretical results, and discuss some related subtleties. It was found that the agent's behaviour followed what is expected theoretically, assuming appropriate parameters were chosen for the Monte-Carlo Tree Search (MCTS) planning algorithm.
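
The three discount families named in the abstract can be illustrated with a minimal Python sketch. The functional forms below are the commonly used textbook ones and are an assumption here, since this page does not state the exact definitions used in the paper. The function names, parameter names, and default values are hypothetical and are not part of the AIXIjs API.

```python
# Illustrative sketch only: commonly used forms of three discount families.
# All names and default parameter values are assumptions, not taken from AIXIjs.

def geometric(k, gamma=0.95):
    """Geometric discount: weight gamma**k for a reward k steps ahead."""
    return gamma ** k


def hyperbolic(k, kappa=1.0, beta=1.0):
    """Hyperbolic discount: weight 1 / (1 + kappa*k)**beta."""
    return 1.0 / (1.0 + kappa * k) ** beta


def power(k, beta=2.0):
    """Power discount: weight k**(-beta); only defined for k >= 1."""
    if k < 1:
        raise ValueError("power discounting requires k >= 1")
    return float(k) ** -beta


def discounted_return(rewards, discount):
    """Sum of discount(k) * r_k, with rewards indexed from k = 1."""
    return sum(discount(k) * r for k, r in enumerate(rewards, start=1))
```

For example, `discounted_return([1, 1], lambda k: geometric(k, 0.5))` weights the two rewards by 0.5 and 0.25. Swapping in `hyperbolic` or `power` changes how quickly later rewards lose weight, which is the kind of effect on an agent's policy that the abstract describes investigating.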

    Original language: English
    Title of host publication: 16th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2017
    Editors: Edmund Durfee, Michael Winikoff, Kate Larson, Sanmay Das
    Publisher: International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
    Pages: 1589-1591
    Number of pages: 3
    ISBN (Electronic): 9781510855076
    Publication status: Published - 2017
    Event: 16th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2017 - Sao Paulo, Brazil
    Duration: 8 May 2017 to 12 May 2017

    Publication series

    Name: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
    Volume: 3
    ISSN (Print): 1548-8403
    ISSN (Electronic): 1558-2914

    Conference

    Conference: 16th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2017
    Country/Territory: Brazil
    City: Sao Paulo
    Period: 8/05/17 to 12/05/17
