TY - GEN
T1 - Robust visual vocabulary tracking using hierarchical model fusion
AU - Bozorgtabar, Behzad
AU - Goecke, Roland
PY - 2013
Y1 - 2013
N2 - In this paper, we propose a new visual tracking approach based on the Hierarchical Model Fusion framework, which fuses two different trackers to cope with different tracking problems. We use an Incremental Multiple Principal Component Analysis tracker as the main model and an image patch tracker as the auxiliary model. First, we randomly sample image patches within the target region obtained by the main model in the training frames and construct a visual vocabulary from them using Histogram of Oriented Gradients features. Second, we apply a supervised learning algorithm based on a Gaussian Mixture Model, which uses supervised information to improve both the discriminative power and the purity of the clusters. The auxiliary models are then initialised by computing confidence scores for image patches based on the similarity between candidates and codewords. The proposed tracking approach also includes an updating procedure and a result refinement scheme. Experiments on challenging video sequences demonstrate the robustness of the proposed approach to occlusion, pose variation and rotation.
UR - http://www.scopus.com/inward/record.url?scp=84893225162&partnerID=8YFLogxK
U2 - 10.1109/DICTA.2013.6691525
DO - 10.1109/DICTA.2013.6691525
M3 - Conference contribution
SN - 9781479921263
T3 - 2013 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2013
BT - 2013 International Conference on Digital Image Computing
T2 - 2013 International Conference on Digital Image Computing: Techniques and Applications, DICTA 2013
Y2 - 26 November 2013 through 28 November 2013
ER -