Cutting edge: Soft correspondences in multimodal scene parsing

Sarah Taghavi Namin, Mohammad Najafi, Mathieu Salzmann, Lars Petersson

    Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

    8 Citations (Scopus)

    Abstract

    Exploiting multiple modalities for semantic scene parsing has been shown to improve accuracy over the single-modality scenario. Existing methods, however, assume that corresponding regions in two modalities have the same label. In this paper, we address the problem of data misalignment and label inconsistencies in semantic labeling, e.g., due to moving objects, which violate the assumption of existing techniques. To this end, we formulate multimodal semantic labeling as inference in a CRF, and introduce latent nodes to explicitly model inconsistencies between two domains. These latent nodes allow us not only to leverage information from both domains to improve their labeling, but also to cut the edges between inconsistent regions. To eliminate the need for hand-tuning the parameters of our model, we propose to learn intra-domain and inter-domain potential functions from training data. We demonstrate the benefits of our approach on two publicly available datasets containing 2D imagery and 3D point clouds. Thanks to our latent nodes and our learning strategy, our method outperforms the state of the art in both cases.
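The idea of a latent node that can cut an inter-domain edge can be illustrated with a toy example. The sketch below is not the model from the paper: it considers a single 2D region paired with a single 3D region, uses hand-set costs rather than learned potentials, and finds the minimum-energy assignment by brute force. The function name `best_labeling`, the label set, and the parameters `disagree_cost` and `cut_cost` are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical two-class label set, for illustration only.
LABELS = ("road", "car")


def best_labeling(unary_2d, unary_3d, disagree_cost=2.0, cut_cost=1.0):
    """Brute-force the minimum-energy assignment for one 2D/3D region pair.

    Energy = unary_2d[a] + unary_3d[b] + pairwise term, where the pairwise
    term depends on a latent binary state of the inter-domain edge:
      linked: pay disagree_cost if the two labels differ, else nothing
      cut:    pay a flat cut_cost; the two labels are then independent
    Returns (energy, label_2d, label_3d, linked).
    """
    best = None
    for a, b, linked in product(LABELS, LABELS, (True, False)):
        if linked:
            pair = disagree_cost if a != b else 0.0
        else:
            pair = cut_cost
        energy = unary_2d[a] + unary_3d[b] + pair
        if best is None or energy < best[0]:
            best = (energy, a, b, linked)
    return best


# Both modalities agree: the edge stays linked and both regions get "road".
agree = best_labeling({"road": 0.2, "car": 1.5}, {"road": 0.3, "car": 1.4})

# Modalities disagree (e.g. a car visible in the image but absent from the
# point cloud): cutting the edge is cheaper than forcing one shared label.
clash = best_labeling({"road": 3.0, "car": 0.1}, {"road": 0.2, "car": 3.0})

print(agree[1:], clash[1:])
```

With these assumed costs, the consistent pair keeps its edge, while the inconsistent pair cuts it and each region keeps the label its own modality prefers. In the paper the corresponding potentials are learned from training data rather than set by hand.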

    Original language: English
    Title of host publication: 2015 International Conference on Computer Vision, ICCV 2015
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 1188-1196
    Number of pages: 9
    ISBN (Electronic): 9781467383912
    Publication status: Published - 17 Feb 2015
    Event: 15th IEEE International Conference on Computer Vision, ICCV 2015 - Santiago, Chile
    Duration: 11 Dec 2015 to 18 Dec 2015

    Publication series

    Name: Proceedings of the IEEE International Conference on Computer Vision
    Volume: 2015 International Conference on Computer Vision, ICCV 2015
    ISSN (Print): 1550-5499

    Conference

    Conference: 15th IEEE International Conference on Computer Vision, ICCV 2015
    Country/Territory: Chile
    City: Santiago
    Period: 11/12/15 to 18/12/15
