Pedestrian detection with motion features via two-stream ConvNets

Ryota Yoshihashi*, Tu Tuan Trinh, Rei Kawakami, Shaodi You, Makoto Iida, Takeshi Naemura

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    6 Citations (Scopus)

    Abstract

    Motion information can be important for detecting objects, yet it has rarely been used for pedestrian detection, particularly in deep-learning-based methods. We propose a method that exploits deep motion features alongside deep still-image features, following the success of two-stream convolutional networks, in which the spatial and temporal streams are trained separately. To extract motion cues that distinguish pedestrians from background motion, the temporal stream takes as input frame differences that are weakly stabilized by optical flow. To make the networks applicable to bounding-box-level detection, their mid-level features are concatenated and combined with a sliding-window detector. We also introduce transfer learning from multiple sources into the two-stream networks, transferring still-image features from ImageNet and motion features from an action recognition dataset, to overcome the insufficiency of training data for convolutional neural networks in pedestrian datasets. We evaluated the method on two popular large-scale pedestrian benchmarks, the Caltech Pedestrian Detection Benchmark and the Daimler Mono Pedestrian Detection Benchmark, and observed a 10% improvement over the same method without motion features.
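
    The sketch below illustrates the kind of weakly stabilized frame difference the temporal stream consumes: the previous frame is warped toward the current one by a coarse, smoothed optical flow, so camera and large-scale background motion largely cancel while fine motion such as limb movement remains. It is a minimal sketch assuming OpenCV's Farneback flow; the function name, blur strength, and flow parameters are illustrative stand-ins, not the authors' exact implementation.

```python
# Hypothetical sketch of a weakly stabilized frame difference
# (illustrative parameters; not the paper's exact pipeline).
import cv2
import numpy as np

def stabilized_frame_difference(prev_gray: np.ndarray,
                                curr_gray: np.ndarray) -> np.ndarray:
    """Difference between the current frame and the flow-warped
    previous frame (both single-channel uint8 images)."""
    # Dense flow from the current frame back to the previous one,
    # so curr(y, x) ~ prev(y + flow[..., 1], x + flow[..., 0]).
    flow = cv2.calcOpticalFlowFarneback(
        curr_gray, prev_gray, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)

    # "Weak" stabilization: smooth the flow field so only camera and
    # large-scale background motion is compensated, leaving the fine
    # non-rigid motion of pedestrians in the residual.
    flow = cv2.GaussianBlur(flow, (0, 0), sigmaX=15)

    # Backward-warp the previous frame into the current frame's geometry.
    h, w = curr_gray.shape
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
    map_x = (grid_x + flow[..., 0]).astype(np.float32)
    map_y = (grid_y + flow[..., 1]).astype(np.float32)
    warped_prev = cv2.remap(prev_gray, map_x, map_y, cv2.INTER_LINEAR)

    # Residual motion after stabilization, used as temporal-stream input.
    return cv2.absdiff(curr_gray, warped_prev)
```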

    Original language: English
    Article number: 12
    Journal: IPSJ Transactions on Computer Vision and Applications
    Volume: 10
    Issue number: 1
    DOIs
    Publication status: Published - 1 Dec 2018

