TY - GEN
T1 - Intermediate semantics based distance metric learning for video annotation and similarity measurements
AU - Qu, Wen
AU - Zhou, Xiangmin
AU - Wang, Daling
AU - Feng, Shi
AU - Zhang, Yifei
AU - Yu, Ge
N1 - Publisher Copyright: © Springer International Publishing AG 2016.
PY - 2016
Y1 - 2016
N2 - The similarity metric between videos is integral to several key tasks, including video retrieval, classification and recommendation. Since there is no standard criterion for measuring the similarity between videos other than manual assessment, it is difficult to collect a large training dataset for distance metric learning algorithms. Moreover, the existing distance metric learning (DML) methods for multimedia data suffer from two critical limitations: (1) they typically attempt to learn a distance function in the single-label setting, in which each item is labeled with only a single label; (2) they are often designed for learning distance metrics on low-level features, which ignore the semantic similarity of the multimedia data. To address these problems, in this paper, we propose a novel framework of Intermediate Semantics based Distance Learning (ISDL) for video clips, which aims to integrate semantics of multiple modalities optimally for distance metric learning. In particular, the proposed framework: (1) generates the training pairs automatically; (2) defines multi-modal concepts for similarity measure among videos; (3) learns the distance metric for video clips based on the intermediate semantics. We conduct an extensive set of experiments to evaluate the performance of the proposed algorithms, and the results validate the effectiveness of our proposed approach.
AB - The similarity metric between videos is integral to several key tasks, including video retrieval, classification and recommendation. Since there is no standard criterion for measuring the similarity between videos other than manual assessment, it is difficult to collect a large training dataset for distance metric learning algorithms. Moreover, the existing distance metric learning (DML) methods for multimedia data suffer from two critical limitations: (1) they typically attempt to learn a distance function in the single-label setting, in which each item is labeled with only a single label; (2) they are often designed for learning distance metrics on low-level features, which ignore the semantic similarity of the multimedia data. To address these problems, in this paper, we propose a novel framework of Intermediate Semantics based Distance Learning (ISDL) for video clips, which aims to integrate semantics of multiple modalities optimally for distance metric learning. In particular, the proposed framework: (1) generates the training pairs automatically; (2) defines multi-modal concepts for similarity measure among videos; (3) learns the distance metric for video clips based on the intermediate semantics. We conduct an extensive set of experiments to evaluate the performance of the proposed algorithms, and the results validate the effectiveness of our proposed approach.
KW - Distance metric learning
KW - Video similarity measure
UR - http://www.scopus.com/inward/record.url?scp=84996490289&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-48740-3_16
DO - 10.1007/978-3-319-48740-3_16
M3 - Conference contribution
SN - 9783319487397
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 227
EP - 242
BT - Web Information Systems Engineering – WISE 2016 - 17th International Conference, Proceedings
A2 - Cellary, Wojciech
A2 - Wang, Jianmin
A2 - Mokbel, Mohamed F.
A2 - Wang, Hua
A2 - Zhou, Rui
A2 - Zhang, Yanchun
PB - Springer Verlag
T2 - 17th International Conference on Web Information Systems Engineering, WISE 2016
Y2 - 8 November 2016 through 10 November 2016
ER -