Convergence and loss bounds for Bayesian sequence prediction

Marcus Hutter*

*Corresponding author for this work

Research output: Contribution to journal › Letter › peer-review

25 Citations (Scopus)

Abstract

The probability of observing x_t at time t, given past observations x_1 ⋯ x_{t-1}, can be computed if the true generating distribution μ of the sequences x_1 x_2 x_3 ⋯ is known. If μ is unknown, but known to belong to a class M, one can base one's prediction on the Bayes mixture ξ, defined as a weighted sum of the distributions ν ∈ M. Various convergence results of the mixture posterior ξ_t to the true posterior μ_t are presented. In particular, a new (elementary) derivation of the convergence ξ_t/μ_t → 1 is provided, which additionally gives the rate of convergence. A general sequence predictor is allowed to choose an action y_t based on x_1 ⋯ x_{t-1} and receives loss ℓ_{x_t y_t} if x_t is the next symbol of the sequence. No assumptions are made on the structure of ℓ (apart from being bounded) or on M. The Bayes-optimal prediction scheme Λ_ξ based on the mixture ξ and the Bayes-optimal informed prediction scheme Λ_μ are defined, and the total loss L_ξ of Λ_ξ is bounded in terms of the total loss L_μ of Λ_μ. It is shown that L_ξ is bounded for bounded L_μ, and that L_ξ/L_μ → 1 as L_μ → ∞. Convergence of the instantaneous losses is also proven.
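As a rough illustration of the setup described above (not taken from the paper itself), the following minimal Python sketch instantiates the model class M as a grid of Bernoulli biases, uses a uniform prior over M, and compares the Bayes-optimal mixture predictor Λ_ξ against the informed predictor Λ_μ under 0-1 loss. The Bernoulli grid, the true bias 0.7, the uniform prior, and the 0-1 loss are all assumptions made for illustration; the paper's results hold for arbitrary bounded losses and general classes M.

```python
import random

# Hypothetical finite model class M: Bernoulli(theta) for a grid of biases.
THETAS = [i / 10 for i in range(1, 10)]   # candidate parameters (assumption)
MU = 0.7                                  # true bias; assumed to lie in M
LOSS = {(x, y): 0.0 if x == y else 1.0    # 0-1 loss; bounded, as required
        for x in (0, 1) for y in (0, 1)}

def bernoulli(theta, x):
    """P(x) under Bernoulli(theta), for x in {0, 1}."""
    return theta if x == 1 else 1.0 - theta

random.seed(0)
weights = [1.0 / len(THETAS)] * len(THETAS)  # uniform prior w_nu over M

total_loss_xi = total_loss_mu = 0.0
for t in range(1, 10001):
    # Mixture predictive probability: xi(1 | x_<t) = sum_nu w_nu * nu(1).
    xi1 = sum(w * th for w, th in zip(weights, THETAS))

    # Bayes-optimal actions: minimize expected instantaneous loss
    # under the mixture xi (scheme Lambda_xi) and under mu (Lambda_mu).
    y_xi = min((0, 1), key=lambda y: xi1 * LOSS[1, y] + (1 - xi1) * LOSS[0, y])
    y_mu = min((0, 1), key=lambda y: MU * LOSS[1, y] + (1 - MU) * LOSS[0, y])

    x = 1 if random.random() < MU else 0     # next symbol drawn from mu
    total_loss_xi += LOSS[x, y_xi]
    total_loss_mu += LOSS[x, y_mu]

    # Bayesian update of the weights: w_nu <- w_nu * nu(x) / xi(x).
    xi_x = xi1 if x == 1 else 1.0 - xi1
    weights = [w * bernoulli(th, x) / xi_x for w, th in zip(weights, THETAS)]

    if t in (10, 100, 1000, 10000):
        mu_x = bernoulli(MU, x)
        print(f"t={t:5d}  xi_t/mu_t={xi_x / mu_x:.4f}  "
              f"L_xi={total_loss_xi:.0f}  L_mu={total_loss_mu:.0f}")
```

Because the true bias lies inside the class, the posterior weight concentrates on it, so the printed ratio ξ_t/μ_t tends to 1 and the two total losses grow at the same asymptotic rate, mirroring the bound L_ξ/L_μ → 1 for L_μ → ∞.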

Original language: English
Pages (from-to): 2061-2067
Number of pages: 7
Journal: IEEE Transactions on Information Theory
Volume: 49
Issue number: 8
DOIs
Publication status: Published - Aug 2003
Externally published: Yes
