Open-World Stereo Video Matching with Deep RNN

Yiran Zhong*, Hongdong Li, Yuchao Dai

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

    13 Citations (Scopus)

    Abstract

    Deep learning based stereo matching methods have shown great success and achieved top scores across different benchmarks. However, like most data-driven methods, existing deep stereo matching networks suffer from some well-known drawbacks: they require large amounts of labeled training data, and their performance is fundamentally limited by their generalization ability. In this paper, we propose a novel Recurrent Neural Network (RNN) that takes a continuous (possibly previously unseen) stereo video as input and directly predicts a depth map at each frame, without a pre-training process and without ground-truth depth maps as supervision. Thanks to its recurrent nature (provided by two convolutional-LSTM blocks), our network is able to memorize and learn from its past experiences, and to modify its inner parameters (network weights) to adapt to previously unseen or unfamiliar environments. This suggests a remarkable generalization ability of the network, making it applicable in an open-world setting. Our method works robustly under changes in scene content, image statistics, lighting, and season. Through extensive experiments, we demonstrate that the proposed method seamlessly adapts between different scenarios. Equally important, in terms of stereo matching accuracy, it outperforms state-of-the-art deep stereo approaches on standard benchmark datasets such as KITTI and Middlebury stereo.

    Original language: English
    Title of host publication: Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings
    Editors: Martial Hebert, Yair Weiss, Vittorio Ferrari, Cristian Sminchisescu
    Publisher: Springer Verlag
    Pages: 104-119
    Number of pages: 16
    ISBN (Print): 9783030012151
    Publication status: Published - 2018
    Event: 15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany
    Duration: 8 Sept 2018 to 14 Sept 2018

    Publication series

    Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
    Volume: 11206 LNCS
    ISSN (Print): 0302-9743
    ISSN (Electronic): 1611-3349

    Conference

    Conference: 15th European Conference on Computer Vision, ECCV 2018
    Country/Territory: Germany
    City: Munich
    Period: 8/09/18 to 14/09/18
