A formal solution to the grain of truth problem

Jan Leike, Jessica Taylor, Benya Fallenstein

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    5 Citations (Scopus)

    Abstract

    A Bayesian agent acting in a multi-agent environment learns to predict the other agents' policies if its prior assigns positive probability to them (in other words, its prior contains a grain of truth). Finding a reasonably large class of policies that contains the Bayes-optimal policies with respect to this class is known as the grain of truth problem. Only small classes are known to have a grain of truth and the literature contains several related impossibility results. In this paper we present a formal and general solution to the full grain of truth problem: we construct a class of policies that contains all computable policies as well as Bayes-optimal policies for every lower semicomputable prior over the class. When the environment is unknown, Bayes-optimal agents may fail to act optimally even asymptotically. However, agents based on Thompson sampling converge to play ε-Nash equilibria in arbitrary unknown computable multi-agent environments. While these results are purely theoretical, we show that they can be computationally approximated arbitrarily closely.

    Original language: English
    Title of host publication: 32nd Conference on Uncertainty in Artificial Intelligence 2016, UAI 2016
    Editors: Dominik Janzing, Alexander Ihler
    Publisher: Association For Uncertainty in Artificial Intelligence (AUAI)
    Pages: 427-436
    Number of pages: 10
    ISBN (Electronic): 9781510827806
    Publication status: Published - 2016
    Event: 32nd Conference on Uncertainty in Artificial Intelligence 2016, UAI 2016 - Jersey City, United States
    Duration: 25 Jun 2016 – 29 Jun 2016

    Publication series

    Name: 32nd Conference on Uncertainty in Artificial Intelligence 2016, UAI 2016

    Conference

    Conference: 32nd Conference on Uncertainty in Artificial Intelligence 2016, UAI 2016
    Country/Territory: United States
    City: Jersey City
    Period: 25/06/16 – 29/06/16
