Towards Real-Time Multi-Object Tracking

Zhongdao Wang, Liang Zheng, Yixuan Liu, Yali Li, Shengjin Wang*

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    648 Citations (Scopus)

    Abstract

    Modern multiple object tracking (MOT) systems usually follow the tracking-by-detection paradigm. Such a system has 1) a detection model for target localization and 2) an appearance embedding model for data association. Executing the two models separately can be inefficient, because the running time is simply the sum of the two steps and no potentially shareable structure between them is exploited. Existing research efforts on real-time MOT usually focus on the association step, so they are essentially real-time association methods rather than real-time MOT systems. In this paper, we propose an MOT system that allows target detection and appearance embedding to be learned in a shared model. Specifically, we incorporate the appearance embedding model into a single-shot detector, such that the model can simultaneously output detections and the corresponding embeddings. We further propose a simple and fast association method that works in conjunction with the joint model. In both components the computation cost is significantly reduced compared with former MOT systems, resulting in a neat and fast baseline for future follow-ups on real-time MOT algorithm design. To our knowledge, this work reports the first (near) real-time MOT system, with a running speed of 22 to 40 FPS depending on the input resolution. Meanwhile, its tracking accuracy is comparable to that of state-of-the-art trackers that use separate detection and embedding (SDE) learning (64.4% MOTA vs. 66.1% MOTA on the MOT-16 challenge). Code and models are available at https://github.com/Zhongdao/Towards-Realtime-MOT.
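
    The abstract does not spell out the association step, so the following is only a minimal sketch of the general idea of matching existing tracks to new detections by appearance embedding. It is not the authors' method: the greedy matching strategy, the function names `cosine_distance` and `associate`, and the `max_dist` threshold of 0.4 are all assumptions made for illustration.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as maximally uninformative
    return 1.0 - dot / (norm_a * norm_b)


def associate(track_embs, det_embs, max_dist=0.4):
    """Greedily match tracks to detections by embedding distance.

    max_dist is a hypothetical gating threshold, not a value from the paper.
    Returns (matches, unmatched_tracks, unmatched_dets), where matches is a
    list of (track_index, detection_index) pairs.
    """
    # Collect every candidate pair that passes the distance gate.
    candidates = []
    for t, track_emb in enumerate(track_embs):
        for d, det_emb in enumerate(det_embs):
            dist = cosine_distance(track_emb, det_emb)
            if dist <= max_dist:
                candidates.append((dist, t, d))

    # Accept the closest pairs first; each track and detection is used once.
    candidates.sort()
    used_tracks, used_dets, matches = set(), set(), []
    for dist, t, d in candidates:
        if t in used_tracks or d in used_dets:
            continue
        used_tracks.add(t)
        used_dets.add(d)
        matches.append((t, d))

    unmatched_tracks = [t for t in range(len(track_embs)) if t not in used_tracks]
    unmatched_dets = [d for d in range(len(det_embs)) if d not in used_dets]
    return matches, unmatched_tracks, unmatched_dets


# Two tracks, three detections; the third detection resembles neither track.
tracks = [[1.0, 0.0], [0.0, 1.0]]
detections = [[0.1, 0.9], [0.9, 0.1], [-1.0, -1.0]]
matches, lost_tracks, new_dets = associate(tracks, detections)
```

    In a joint model of the kind the abstract describes, the detection embeddings would come from the same forward pass as the bounding boxes, so a step like this is the only per-frame work left after inference. An optimal assignment solver could replace the greedy loop where accuracy matters more than simplicity.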

    Original language: English
    Title of host publication: Computer Vision – ECCV 2020 - 16th European Conference, 2020, Proceedings
    Editors: Andrea Vedaldi, Horst Bischof, Thomas Brox, Jan-Michael Frahm
    Publisher: Springer Science and Business Media Deutschland GmbH
    Pages: 107-122
    Number of pages: 16
    ISBN (Print): 9783030586201
    Publication status: Published - 2020
    Event: 16th European Conference on Computer Vision, ECCV 2020 - Glasgow, United Kingdom
    Duration: 23 Aug 2020 to 28 Aug 2020

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 12356 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 16th European Conference on Computer Vision, ECCV 2020
    Country/Territory: United Kingdom
    City: Glasgow
    Period: 23/08/20 to 28/08/20
