Learning multi-view neighborhood preserving projections

Novi Quadrianto*, Christoph H. Lampert

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

92 Citations (Scopus)

Abstract

We address the problem of metric learning for multi-view data, namely the construction of embedding projections from data in different representations into a shared feature space, such that the Euclidean distance in this space provides a meaningful within-view as well as between-view similarity. Our motivation stems from the problem of cross-media retrieval tasks, where the availability of a joint Euclidean distance function is a pre-requisite to allow fast, in particular hashing-based, nearest neighbor queries. We formulate an objective function that expresses the intuitive concept that matching samples are mapped closely together in the output space, whereas non-matching samples are pushed apart, no matter in which view they are available. The resulting optimization problem is not convex, but it can be decomposed explicitly into a convex and a concave part, thereby allowing efficient optimization using the convex-concave procedure. Experiments on an image retrieval task show that nearest-neighbor based cross-view retrieval is indeed possible, and the proposed technique improves the retrieval accuracy over baseline techniques.
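The convex-concave procedure mentioned in the abstract can be illustrated on a toy problem. The following is a minimal sketch only, not the authors' objective or implementation: the one-dimensional function f(x) = x^4 - 2x^2, its split into a convex part x^4 and a concave part -2x^2, and all function names are assumptions chosen for illustration. Each iteration replaces the concave part by its tangent at the current point and minimizes the resulting convex surrogate, which here has a closed-form solution.

```python
import math

# Toy objective (an assumption for illustration, not the paper's objective):
# f(x) = x**4 - 2*x**2, split into a convex part and a concave part.

def convex_part(x):
    return x ** 4

def concave_part(x):
    return -2.0 * x ** 2

def concave_grad(x):
    # Derivative of the concave part.
    return -4.0 * x

def objective(x):
    return convex_part(x) + concave_part(x)

def cccp_step(x_t):
    # Linearize the concave part at x_t and minimize the convex surrogate
    #   x**4 + concave_grad(x_t) * x.
    # Stationarity gives 4 * x**3 = -concave_grad(x_t), i.e. a real cube root.
    g = -concave_grad(x_t) / 4.0
    return math.copysign(abs(g) ** (1.0 / 3.0), g)

def cccp(x0, n_iter=50, tol=1e-12):
    # Run the convex-concave procedure from x0 and record objective values.
    x = x0
    history = [objective(x)]
    for _ in range(n_iter):
        x_new = cccp_step(x)
        history.append(objective(x_new))
        converged = abs(x_new - x) < tol
        x = x_new
        if converged:
            break
    return x, history

if __name__ == "__main__":
    x_star, history = cccp(0.1)
    print(x_star, history[0], history[-1])
```

Starting from a positive initial point, the iterates approach the local minimum at x = 1, and the recorded objective values do not increase from one iteration to the next, which is the property that makes the procedure attractive for non-convex objectives of this form.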

Original language: English
Title of host publication: Proceedings of the 28th International Conference on Machine Learning, ICML 2011
Pages: 425-432
Number of pages: 8
Publication status: Published - 2011
Externally published: Yes
Event: 28th International Conference on Machine Learning, ICML 2011 - Bellevue, WA, United States
Duration: 28 Jun 2011 to 2 Jul 2011

Publication series

Name: Proceedings of the 28th International Conference on Machine Learning, ICML 2011

Conference

Conference: 28th International Conference on Machine Learning, ICML 2011
Country/Territory: United States
City: Bellevue, WA
Period: 28/06/11 to 2/07/11
