Not All Negatives Are Equal: Learning to Track with Multiple Background Clusters

Gao Zhu, Fatih Porikli, Hongdong Li

    Research output: Contribution to journal › Article › peer-review

    5 Citations (Scopus)


    Conventional tracking-by-detection approaches for visual object tracking often assume that the task at hand is a binary foreground-versus-background classification problem, in which the background is a single, generic, and all-inclusive class. In contrast, here we argue that the background appearance, for the most part, possesses a more complicated structure that would benefit from further partitioning into multiple contextual clusters. Our observation is that, although the background class is assumed to contain a vast intra-class variation, only a small portion of this diversity is present around the foreground object in the current frame during tracking. This observation motivates us to build multiple fine-grained foreground-versus-contextual-cluster models that provide more discriminative classifications, and consequently more robust and accurate foreground object tracking. For each cluster, we employ a structured output support vector machine (SSVM), and we combine the responses of the multiple classifiers in an online manner. To this end, we apply a top-level SSVM that models the tracked foreground object. We show that our refined modeling of the background is better than naïvely growing the complexity of a single foreground-background classifier, i.e., increasing the number of support vectors that existing approaches rely on, which causes overfitting issues. Our extensive evaluations on large benchmark data sets demonstrate that our tracker consistently outperforms the current state-of-the-art while having comparable computational requirements.
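The idea of scoring candidates against several background clusters, rather than against one all-inclusive background class, can be illustrated with a toy sketch. This is not the authors' method: it assumes 2-D point features, forms the clusters with k-means (the abstract does not say how the clusters are built), swaps each per-cluster SSVM for a least-squares linear classifier, and replaces the top-level SSVM with a simple minimum over the per-cluster scores. All function names (`kmeans`, `fit_linear`, `train_cluster_models`, `combined_score`) are hypothetical.

```python
import numpy as np


def assign(points, centers):
    """Index of the nearest center for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(points, k, iters=20):
    """Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min(
            np.linalg.norm(points[:, None, :] - np.array(centers)[None], axis=2),
            axis=1,
        )
        centers.append(points[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        labels = assign(points, centers)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
    return centers, assign(points, centers)


def add_bias(x):
    """Append a constant 1 column so the classifier has an offset term."""
    return np.hstack([x, np.ones((len(x), 1))])


def fit_linear(pos, neg, reg=1e-2):
    """Ridge-regularised least-squares classifier: +1 foreground, -1 cluster.
    Assumption: stands in for the per-cluster SSVM named in the abstract."""
    x = add_bias(np.vstack([pos, neg]))
    y = np.concatenate([np.ones(len(pos)), -np.ones(len(neg))])
    return np.linalg.solve(x.T @ x + reg * np.eye(x.shape[1]), x.T @ y)


def train_cluster_models(foreground, background, k):
    """One foreground-versus-cluster model per non-empty background cluster."""
    _, labels = kmeans(background, k)
    return [
        fit_linear(foreground, background[labels == j])
        for j in range(k)
        if np.any(labels == j)
    ]


def combined_score(models, candidates):
    """A candidate must look like foreground against every cluster, so take
    the minimum response. Assumption: replaces the top-level SSVM."""
    x = add_bias(candidates)
    return np.min(np.stack([x @ w for w in models]), axis=0)


# Toy data: foreground near the origin, three distinct background contexts.
rng = np.random.default_rng(1)
foreground = rng.normal(0.0, 0.3, size=(40, 2))
background = np.vstack(
    [rng.normal(c, 0.3, size=(40, 2)) for c in ([5, 0], [-5, 0], [0, 5])]
)

models = train_cluster_models(foreground, background, k=3)
candidates = np.array([[0.1, -0.1], [5.0, 0.0], [-5.0, 0.0]])
scores = combined_score(models, candidates)
best = int(np.argmax(scores))  # index of the most foreground-like candidate
```

In this toy layout the background surrounds the foreground, so no single linear boundary separates the two, while each foreground-versus-cluster problem is linearly separable on its own. That is the motivation the abstract gives for partitioning the background instead of growing one classifier.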

    Original language: English
    Article number: 7583707
    Pages (from-to): 314-326
    Number of pages: 13
    Journal: IEEE Transactions on Circuits and Systems for Video Technology
    Issue number: 2
    Publication status: Published - Feb 2018

