TY - GEN
T1 - A multi-modal graphical model for scene analysis
AU - Namin, Sarah Taghavi
AU - Najafi, Mohammad
AU - Salzmann, Mathieu
AU - Petersson, Lars
N1 - Publisher Copyright:
© 2015 IEEE.
PY - 2015/2/19
Y1 - 2015/2/19
N2 - In this paper, we introduce a multi-modal graphical model to address the problem of semantic segmentation using 2D-3D data exhibiting extensive many-to-one correspondences. Existing methods often impose a hard correspondence between the 2D and 3D data, forcing corresponding 2D and 3D regions to receive identical labels. This degrades performance due to misalignments, 3D-to-2D projection errors, and occlusions. We address this issue by defining a graph over the entire set of data that models soft correspondences between the two modalities. This graph encourages each region in one modality to leverage the information from its corresponding regions in the other modality to better estimate its class label. We evaluate our method on a publicly available dataset and outperform the state of the art. Additionally, to demonstrate our model's ability to support multiple correspondences between objects in the 3D and 2D domains, we introduce a new multi-modal dataset, composed of panoramic images and LIDAR data, that features a rich set of many-to-one correspondences.
AB - In this paper, we introduce a multi-modal graphical model to address the problem of semantic segmentation using 2D-3D data exhibiting extensive many-to-one correspondences. Existing methods often impose a hard correspondence between the 2D and 3D data, forcing corresponding 2D and 3D regions to receive identical labels. This degrades performance due to misalignments, 3D-to-2D projection errors, and occlusions. We address this issue by defining a graph over the entire set of data that models soft correspondences between the two modalities. This graph encourages each region in one modality to leverage the information from its corresponding regions in the other modality to better estimate its class label. We evaluate our method on a publicly available dataset and outperform the state of the art. Additionally, to demonstrate our model's ability to support multiple correspondences between objects in the 3D and 2D domains, we introduce a new multi-modal dataset, composed of panoramic images and LIDAR data, that features a rich set of many-to-one correspondences.
UR - http://www.scopus.com/inward/record.url?scp=84925431004&partnerID=8YFLogxK
U2 - 10.1109/WACV.2015.139
DO - 10.1109/WACV.2015.139
M3 - Conference contribution
T3 - Proceedings - 2015 IEEE Winter Conference on Applications of Computer Vision, WACV 2015
SP - 1006
EP - 1013
BT - Proceedings - 2015 IEEE Winter Conference on Applications of Computer Vision, WACV 2015
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2015 15th IEEE Winter Conference on Applications of Computer Vision, WACV 2015
Y2 - 5 January 2015 through 9 January 2015
ER -