A multi-modal graphical model for scene analysis

Sarah Taghavi Namin, Mohammad Najafi, Mathieu Salzmann, Lars Petersson

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

    16 Citations (Scopus)

    Abstract

    In this paper, we introduce a multi-modal graphical model to address the problem of semantic segmentation using 2D-3D data exhibiting extensive many-to-one correspondences. Existing methods often impose a hard correspondence between the 2D and 3D data, where corresponding 2D and 3D regions are forced to receive identical labels. This degrades performance in the presence of misalignments, 3D-2D projection errors and occlusions. We address this issue by defining a graph over the entire set of data that models soft correspondences between the two modalities. This graph encourages each region in a modality to leverage the information from its corresponding regions in the other modality to better estimate its class label. We evaluate our method on a publicly available dataset and outperform the state of the art. Additionally, to demonstrate the ability of our model to support multiple correspondences for objects in the 3D and 2D domains, we introduce a new multi-modal dataset, which is composed of panoramic images and LIDAR data and features a rich set of many-to-one correspondences.
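The difference between hard and soft correspondences described in the abstract can be illustrated with a toy example. The sketch below is not the authors' formulation: the Potts-style disagreement penalty, the cost values, the exhaustive search, and the names `energy` and `best_labeling` are all assumptions made for illustration. Two 2D regions are linked to a single 3D segment (a many-to-one correspondence), and one of the 2D regions has strong evidence for a different label, as might happen after a projection error.

```python
from itertools import product


def energy(labels_2d, labels_3d, unary_2d, unary_3d, edges, weight):
    """Per-region costs plus a penalty for each cross-modal pair that disagrees.

    The Potts-style penalty is an assumption, not the potential used in the paper.
    """
    e = sum(unary_2d[i][l] for i, l in enumerate(labels_2d))
    e += sum(unary_3d[j][l] for j, l in enumerate(labels_3d))
    e += sum(weight for i, j in edges if labels_2d[i] != labels_3d[j])
    return e


def best_labeling(unary_2d, unary_3d, edges, weight, n_labels):
    """Exhaustive search over all labelings; only viable for toy problems."""
    n2, n3 = len(unary_2d), len(unary_3d)
    best = min(
        product(range(n_labels), repeat=n2 + n3),
        key=lambda lab: energy(lab[:n2], lab[n2:], unary_2d, unary_3d, edges, weight),
    )
    return list(best[:n2]), list(best[n2:])


# Hypothetical costs: unary[k][l] is the cost of giving region k label l.
unary_2d = [[0.1, 2.0], [3.0, 0.2]]  # region 0 prefers label 0, region 1 prefers label 1
unary_3d = [[0.3, 1.5]]              # the single 3D segment prefers label 0
edges = [(0, 0), (1, 0)]             # both 2D regions correspond to 3D segment 0

# Soft correspondence: disagreement is penalized but allowed.
soft = best_labeling(unary_2d, unary_3d, edges, weight=0.5, n_labels=2)
# A very large weight approximates a hard correspondence (identical labels forced).
hard = best_labeling(unary_2d, unary_3d, edges, weight=1e6, n_labels=2)

print("soft:", soft)
print("hard:", hard)
```

With these made-up costs, the soft setting lets the second 2D region keep the label its own evidence supports, while the near-hard setting forces it to take the label of the 3D segment.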

    Original language: English
    Title of host publication: Proceedings - 2015 IEEE Winter Conference on Applications of Computer Vision, WACV 2015
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 1006-1013
    Number of pages: 8
    ISBN (Electronic): 9781479966820
    Publication status: Published - 19 Feb 2015
    Event: 2015 15th IEEE Winter Conference on Applications of Computer Vision, WACV 2015 - Waikoloa, United States
    Duration: 5 Jan 2015 → 9 Jan 2015

    Publication series

    Name: Proceedings - 2015 IEEE Winter Conference on Applications of Computer Vision, WACV 2015

    Conference

    Conference: 2015 15th IEEE Winter Conference on Applications of Computer Vision, WACV 2015
    Country/Territory: United States
    City: Waikoloa
    Period: 5/01/15 → 9/01/15
