The structure of version space

Ralf Herbrich*, Thore Graepel, Robert C. Williamson

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingChapterpeer-review

Abstract

We investigate the generalisation performance of consistent classifiers, i.e. classifiers that are contained in the so-called version space, both from a theoretical and experimental angle. In contrast to classical VC analysis - where no single classifier within version space is singled out on grounds of a generalisation error bound - the data dependent structural risk minimisation framework suggests that there exists one particular classifier that is to be preferred because it minimises the generalisation error bound. This is usually taken to provide a theoretical justification for learning algorithms such as the well known support vector machine. A reinterpretation of a recent PAC-Bayesian result, however, reveals that given a suitably chosen hypothesis space there exists a large fraction of classifiers with small generalisation error albeit we cannot identify them for a specific learning task. In the particular case of linear classifiers we show that classifiers found by the classical perceptron algorithm have guarantees bounded by the size of version space. These results are complemented with an empirical study for kernel classifiers on the task of handwritten digit recognition which demonstrates that even classifiers with a small margin may exhibit excellent generalisation. In order to perform this analysis we introduce the kernel Gibbs sampler - an algorithm which can be used to sample consistent kernel classifiers.

Original languageEnglish
Title of host publicationInnovations in Machine Learning
Subtitle of host publicationTheory and Applications
EditorsDawn Holmes, Lakhmi Jain
Pages257-273
Number of pages17
DOIs
Publication statusPublished - 2006
Externally publishedYes

Publication series

NameStudies in Fuzziness and Soft Computing
Volume194
ISSN (Print)1434-9922

Fingerprint

Dive into the research topics of 'The structure of version space'. Together they form a unique fingerprint.

Cite this