TY - CONF
T1 - Discrete MDL predicts in total variation
AU - Hutter, Marcus
PY - 2009
N2 - The Minimum Description Length (MDL) principle selects the model that has the shortest code for data plus model. We show that for a countable class of models, MDL predictions are close to the true distribution in a strong sense. The result is completely general. No independence, ergodicity, stationarity, identifiability, or other assumptions on the model class need to be made. More formally, we show that for any countable class of models, the distributions selected by MDL (or MAP) asymptotically predict (merge with) the true measure in the class in total variation distance. Implications for non-i.i.d. domains like time-series forecasting, discriminative learning, and reinforcement learning are discussed.
UR - http://www.scopus.com/inward/record.url?scp=84858716044&partnerID=8YFLogxK
M3 - Conference contribution
SN - 9781615679119
T3 - Advances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference
SP - 817
EP - 825
BT - Advances in Neural Information Processing Systems 22 - Proceedings of the 2009 Conference
PB - Neural Information Processing Systems
T2 - 23rd Annual Conference on Neural Information Processing Systems, NIPS 2009
Y2 - 7 December 2009 through 10 December 2009
ER -