TY - JOUR
T1 - Probabilistic models of vision and max-margin methods
AU - Yuille, Alan
AU - He, Xuming
PY - 2012/3
Y1 - 2012/3
N2 - It is attractive to formulate problems in computer vision and related fields in terms of probabilistic estimation, where the probability models are defined over graphs, such as grammars. The graphical structures, and the state variables defined over them, give a rich knowledge representation that can describe the complex structures of objects and images. The probability distributions defined over the graphs capture the statistical variability of these structures. These probability models can be learnt from training data with limited amounts of supervision. But learning these models suffers from the difficulty of evaluating the normalization constant, or partition function, of the probability distributions, which can be extremely computationally demanding. This paper shows that by placing bounds on the normalization constant we can obtain computationally tractable approximations. Surprisingly, for certain choices of loss functions, we obtain many of the standard max-margin criteria used in support vector machines (SVMs), and hence we reduce the learning to standard machine learning methods. We show that many machine learning methods can be obtained in this way as approximations to probabilistic methods, including multi-class max-margin, ordinal regression, max-margin Markov networks and parsers, multiple-instance learning, and latent SVM. We illustrate this work with computer vision applications including image labeling, object detection and localization, and motion estimation. We speculate that better results can be obtained by using better bounds and approximations.
KW - loss function
KW - max-margin learning
KW - probabilistic models
KW - structured prediction
UR - http://www.scopus.com/inward/record.url?scp=84863412765&partnerID=8YFLogxK
U2 - 10.1007/s11460-012-0170-6
DO - 10.1007/s11460-012-0170-6
M3 - Article
SN - 1673-3460
VL - 7
SP - 94
EP - 106
JO - Frontiers of Electrical and Electronic Engineering in China
JF - Frontiers of Electrical and Electronic Engineering in China
IS - 1
ER -
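
Reader's note: the abstract's key claim is that bounding the partition function of a probabilistic model recovers standard max-margin criteria. Below is a minimal LaTeX sketch of that connection for a generic log-linear model; the notation (features \phi, weights w, loss L) is illustrative and assumed here, not quoted from the paper.

\documentclass{article}
\usepackage{amsmath}
\begin{document}
% Sketch: bounding the partition function yields a max-margin criterion.
% Assumed log-linear model with features \phi(x,y), weights w, and a
% task loss L(y,y') satisfying L(y,y) = 0 (illustrative assumptions).
\begin{align*}
P(y \mid x; w) &= \frac{\exp\bigl(w \cdot \phi(x, y)\bigr)}{Z(x; w)},
  \qquad Z(x; w) = \sum_{y'} \exp\bigl(w \cdot \phi(x, y')\bigr), \\
-\log P(y \mid x; w) &= \log Z(x; w) - w \cdot \phi(x, y).
\intertext{Approximating the log-partition function by its dominant
(loss-augmented) term, $\log \sum_{y'} e^{a_{y'}} \approx \max_{y'} a_{y'}$,
gives}
-\log P(y \mid x; w) &\approx \max_{y'} \bigl[\, w \cdot \phi(x, y')
  + L(y, y') \,\bigr] - w \cdot \phi(x, y),
\end{align*}
which is the structured hinge loss minimized by max-margin methods such
as the structured SVM.
\end{document}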