Automatic refinement strategies for manual initialization of object trackers

Hao Zhu*, Fatih Porikli

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    3 Citations (Scopus)

    Abstract

    Tracking objects across multiple frames is a well-investigated problem in computer vision. The majority of existing algorithms assume that an accurate initialization is readily available. However, in many real-life settings, in particular for applications where the video is streaming in real time, the initialization has to be provided by a human operator. This dependence on manual input inevitably introduces uncertainty. Here, we first collect a large, new data set of more than 20K human initialization clicks, provided by several subjects under three practical user interface scenarios for the popular TB50 tracking benchmark. We analyze the factors and mechanisms of human input, derive statistical models, and show that human input always contains deviations, which grow as the relative object-camera motion becomes large. We also design and evaluate alternative refinement schemes, and propose a strategy that refits an object window on the most probable target region after a single click. To compensate for human initialization errors, our method generates window proposals using objectness cues extracted from color and motion attributes, accumulates them into a likelihood map that is weighted by the initial click position and visual saliency scores, and assigns the final window by the maximum likelihood estimate. Our experiments demonstrate that the presented refinement strategy effectively reduces human input errors.
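
    The refinement pipeline summarized above (window proposals, a likelihood map weighted by the click position and saliency, then a maximum-likelihood window) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `refine_click`, the Gaussian click prior with width `sigma`, and the rule of scoring each proposal by its mean likelihood are all assumptions made for illustration, and the proposals, objectness scores, and saliency map are taken as given inputs rather than computed from color and motion cues.

```python
import numpy as np


def refine_click(click_xy, proposals, scores, saliency, sigma=20.0):
    """Pick the proposal window best supported by a click-weighted likelihood map.

    click_xy  : (x, y) pixel position of the user's click
    proposals : list of (x0, y0, x1, y1) integer windows with positive area
    scores    : one objectness score per proposal (assumed precomputed)
    saliency  : HxW array of non-negative saliency values (assumed precomputed)
    sigma     : width of the assumed Gaussian prior around the click, in pixels
    """
    h, w = saliency.shape
    ys, xs = np.mgrid[0:h, 0:w]
    cx, cy = click_xy

    # Assumed click prior: pixels near the click are more likely to be on target.
    click_prior = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2.0 * sigma ** 2))

    # Accumulate the proposals into a map: each window votes with its score.
    accumulated = np.zeros((h, w), dtype=float)
    for (x0, y0, x1, y1), score in zip(proposals, scores):
        accumulated[y0:y1, x0:x1] += score

    # Weight the accumulated votes by the click prior and the saliency map.
    likelihood = accumulated * click_prior * saliency

    # Assumed selection rule: keep the proposal with the highest mean likelihood.
    best = max(proposals, key=lambda b: likelihood[b[1]:b[3], b[0]:b[2]].mean())
    return best, likelihood
```

    For example, with a uniform saliency map, a click at (30, 30), and two equally scored proposals, one around the click and one far from it, the sketch selects the window around the click because the click prior suppresses the distant one.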

    Original language: English
    Article number: 7762927
    Pages (from-to): 821-835
    Number of pages: 15
    Journal: IEEE Transactions on Image Processing
    Volume: 26
    Issue number: 2
    Publication status: Published - Feb 2017
