3D box proposals from a single monocular image of an indoor scene

Wei Zhuo, Mathieu Salzmann, Xuming He, Miaomiao Liu

    Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

    4 Citations (Scopus)

    Abstract

    Modern object detection methods typically rely on bounding box proposals as input. While initially popularized in the 2D case, this idea has received increasing attention for 3D bounding boxes. Nevertheless, existing 3D box proposal techniques all assume having access to depth as input, which is unfortunately not always available in practice. In this paper, we therefore introduce an approach to generating 3D box proposals from a single monocular RGB image. To this end, we develop an integrated, fully differentiable framework that inherently predicts a depth map, extracts a 3D volumetric scene representation and generates 3D object proposals. At the core of our approach lies a novel residual, differentiable truncated signed distance function module, which, accounting for the relatively low accuracy of the predicted depth map, extracts a 3D volumetric representation of the scene. Our experiments on the standard NYUv2 dataset demonstrate that our framework lets us generate high-quality 3D box proposals and that it outperforms the two-stage technique consisting of successively performing state-of-the-art depth prediction and depth-based 3D proposal generation.
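
To make the volumetric step of the abstract concrete, the following is a minimal sketch of a standard projective truncated signed distance function (TSDF) computed from a depth map. It is not the residual, differentiable module proposed in the paper, whose details the abstract does not give. The function name `projective_tsdf`, the pinhole intrinsics `fx`, `fy`, `cx`, `cy`, the truncation distance `trunc`, and the convention of assigning +1 to unobserved voxels are all assumptions made for illustration.

```python
def projective_tsdf(depth, fx, fy, cx, cy, voxel_centers, trunc):
    """Illustrative projective TSDF (hypothetical helper, not the paper's module).

    depth         -- H x W nested list of depths along the camera z-axis
    fx, fy, cx, cy -- assumed pinhole camera intrinsics
    voxel_centers -- iterable of (x, y, z) points in the camera frame
    trunc         -- truncation distance, in the same units as depth
    Returns one value in [-1, 1] per voxel: positive in front of the
    observed surface, negative behind it, zero on it.
    """
    height = len(depth)
    width = len(depth[0])
    values = []
    for x, y, z in voxel_centers:
        if z <= 0:
            # Behind the camera: assumed convention is "unobserved" = +1.
            values.append(1.0)
            continue
        # Project the voxel center to the nearest pixel.
        u = int(round(fx * x / z + cx))
        v = int(round(fy * y / z + cy))
        if not (0 <= u < width and 0 <= v < height):
            # Outside the image: also treated as unobserved.
            values.append(1.0)
            continue
        # Signed distance along the viewing axis, then truncate and normalize.
        sdf = depth[v][u] - z
        values.append(max(-1.0, min(1.0, sdf / trunc)))
    return values


# Toy usage: a flat wall 2 units in front of a 5 x 5 camera.
wall = [[2.0] * 5 for _ in range(5)]
voxels = [(0.0, 0.0, 1.0), (0.0, 0.0, 2.0), (0.0, 0.0, 3.0)]
print(projective_tsdf(wall, 1.0, 1.0, 2.0, 2.0, voxels, trunc=0.5))
# prints [1.0, 0.0, -1.0]
```

In this sketch, errors in the depth map pass directly into the volumetric representation, which is the kind of sensitivity the abstract says its module is designed to account for when the depth is predicted rather than measured.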

    Original language: English
    Title of host publication: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
    Publisher: AAAI Press
    Pages: 7639-7647
    Number of pages: 9
    ISBN (Electronic): 9781577358008
    Publication status: Published - 2018
    Event: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018 - New Orleans, United States
    Duration: 2 Feb 2018 to 7 Feb 2018

    Publication series

    Name: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018

    Conference

    Conference: 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
    Country/Territory: United States
    City: New Orleans
    Period: 2/02/18 to 7/02/18
