TY - GEN
T1 - Optimization of robust loss functions for weakly-labeled image taxonomies
T2 - 8th International Conference on Energy Minimization Methods in Computer Vision and Pattern Recognition, EMMCVPR 2011
AU - McAuley, Julian J.
AU - Ramisa, Arnau
AU - Caetano, Tibério S.
PY - 2011
Y1 - 2011
N2 - The recently proposed ImageNet dataset consists of several million images, each annotated with a single object category. However, these annotations may be imperfect, in the sense that many images contain multiple objects belonging to the label vocabulary. In other words, we have a multi-label problem but the annotations include only a single label (and not necessarily the most prominent). Such a setting motivates the use of a robust evaluation measure, which allows for a limited number of labels to be predicted and, as long as one of the predicted labels is correct, the overall prediction should be considered correct. This is indeed the type of evaluation measure used to assess algorithm performance in a recent competition on ImageNet data. Optimizing such types of performance measures presents several hurdles even with existing structured output learning methods. Indeed, many of the current state-of-the-art methods optimize the prediction of only a single output label, ignoring this 'structure' altogether. In this paper, we show how to directly optimize continuous surrogates of such performance measures using structured output learning techniques with latent variables. We use the output of existing binary classifiers as input features in a new learning stage which optimizes the structured loss corresponding to the robust performance measure. We present empirical evidence that this allows us to 'boost' the performance of existing binary classifiers which are the state-of-the-art for the task of object classification in ImageNet.
UR - http://www.scopus.com/inward/record.url?scp=80051747835&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-23094-3_26
DO - 10.1007/978-3-642-23094-3_26
M3 - Conference contribution
SN - 9783642230936
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 355
EP - 368
BT - Energy Minimization Methods in Computer Vision and Pattern Recognition - 8th International Conference, EMMCVPR 2011, Proceedings
Y2 - 25 July 2011 through 27 July 2011
ER -