Asymptotic learnability of reinforcement problems with arbitrary dependence

Daniil Ryabko*, Marcus Hutter

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

We address the problem of reinforcement learning in which observations may exhibit an arbitrary form of stochastic dependence on past observations and actions, i.e., environments more general than (PO)MDPs. The task for an agent is to attain the best possible asymptotic reward when the true generating environment is unknown but belongs to a known countable family of environments. We find sufficient conditions on the class of environments under which there exists an agent that attains the best asymptotic reward for any environment in the class. We analyze how tight these conditions are and how they relate to various probabilistic assumptions known in reinforcement learning and related fields, such as Markov Decision Processes and mixing conditions.
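To make the setting concrete, the following is a minimal illustrative sketch (not the authors' construction; all names here are hypothetical) of an environment whose rewards depend on the entire history of actions, so it is not captured by a finite-state (PO)MDP, together with the empirical average reward whose limit the asymptotic criterion concerns.

```python
# Illustrative sketch only: a history-dependent (non-Markov) environment
# in the spirit of the problem setting described in the abstract.

class HistoryDependentEnv:
    """Reward depends on the entire past action sequence, not on a Markov state."""

    def __init__(self):
        self.history = []  # full sequence of past actions

    def step(self, action):
        # Reward 1 if the action matches the majority of *all* past actions,
        # so acting optimally requires unbounded memory of the history.
        ones = sum(self.history)
        majority = 1 if ones * 2 >= len(self.history) else 0
        reward = 1.0 if action == majority else 0.0
        self.history.append(action)
        return reward


def average_reward(policy, env, horizon):
    """Empirical average reward over a finite horizon; the asymptotic
    criterion concerns this quantity as horizon -> infinity."""
    total = 0.0
    history = []
    for _ in range(horizon):
        a = policy(history)
        r = env.step(a)
        total += r
        history.append((a, r))
    return total / horizon


# A simple policy that always plays 1; in this toy environment it matches
# the majority from the first step onward and so earns reward 1 every step.
always_one = lambda history: 1
env = HistoryDependentEnv()
avg = average_reward(always_one, env, horizon=1000)
```

The point of the sketch is only the interface: the environment conditions on the whole interaction history, and performance is judged by the long-run average reward rather than by convergence to a fixed-state value function.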

Original language: English
Title of host publication: Algorithmic Learning Theory - 17th International Conference, ALT 2006, Proceedings
Publisher: Springer Verlag
Pages: 334-347
Number of pages: 14
ISBN (Print): 3540466495, 9783540466499
Publication status: Published - 2006
Externally published: Yes
Event: 17th International Conference on Algorithmic Learning Theory, ALT 2006 - Barcelona, Spain
Duration: 7 Oct 2006 - 10 Oct 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4264 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 17th International Conference on Algorithmic Learning Theory, ALT 2006
Country/Territory: Spain
City: Barcelona
Period: 7/10/06 - 10/10/06
