TY - GEN
T1 - Accelerating the divisive information-theoretic clustering of visual words
AU - Zhang, Jianjia
AU - Wang, Lei
AU - Liu, Lingqiao
AU - Zhou, Luping
AU - Li, Wanqing
PY - 2013
Y1 - 2013
N2 - Word clustering is an effective approach in the bag-of-words model for reducing the dimensionality of high-dimensional features. In recent years, the bag-of-words model has been successfully introduced into visual recognition and developed significantly. To adequately model complex and diverse visual patterns, a large number of visual words is often used, especially in state-of-the-art visual recognition methods. As a result, existing word clustering algorithms are no longer computationally efficient enough. They can considerably prolong processes such as model updating and parameter tuning, in which word clustering must be applied repeatedly. In this paper, we focus on divisive information-theoretic clustering, one of the most efficient word clustering algorithms in the field of text analysis, and accelerate it to better handle a large number of visual words. We discuss the properties of its cluster membership evaluation function, the KL-divergence, in both the binary and multi-class classification cases, and develop accelerated versions in two different ways. Theoretical analysis shows that the proposed accelerated divisive information-theoretic clustering algorithm can handle a large number of visual words much more efficiently. As demonstrated on benchmark datasets in visual recognition, it can achieve a speedup of hundreds of times while maintaining the clustering performance of the original algorithm.
UR - http://www.scopus.com/inward/record.url?scp=84893269588&partnerID=8YFLogxK
U2 - 10.1109/DICTA.2013.6691476
DO - 10.1109/DICTA.2013.6691476
M3 - Conference contribution
SN - 9781479921263
T3 - 2013 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2013
BT - 2013 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2013
T2 - 2013 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2013
Y2 - 26 November 2013 through 28 November 2013
ER -