TY - GEN
T1 - Learning to generate object segment proposals with multi-modal cues
AU - Zhang, Haoyang
AU - He, Xuming
AU - Porikli, Fatih
N1 - Publisher Copyright: © Springer International Publishing AG 2017.
PY - 2017
Y1 - 2017
AB - This paper presents a learning-based object segmentation proposal generation method for stereo images. Unlike existing methods, which mostly rely on low-level appearance cues and handcrafted similarity functions to group segments, our method makes use of learned deep features and designed geometric features to represent a region, as well as a learned similarity network to guide the grouping process. Given an initial segmentation hierarchy, we sequentially merge adjacent regions in each level based on their affinity measured by the similarity network. This merging process generates new segmentation hierarchies, which are then used to produce a pool of regional proposals by taking region singletons, pairs, triplets and 4-tuples from them. In addition, we learn a ranking network that predicts the objectness score of each regional proposal and diversify the ranking based on Maximum Marginal Relevance measures. Experiments on the Cityscapes dataset show that our approach performs significantly better than the baseline and the current state of the art.
UR - http://www.scopus.com/inward/record.url?scp=85016055719&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-54181-5_8
DO - 10.1007/978-3-319-54181-5_8
M3 - Conference contribution
SN - 9783319541808
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 121
EP - 136
BT - Computer Vision - ACCV 2016 - 13th Asian Conference on Computer Vision, Revised Selected Papers
A2 - Sato, Yoichi
A2 - Nishino, Ko
A2 - Lepetit, Vincent
A2 - Lai, Shang-Hong
PB - Springer Verlag
T2 - 13th Asian Conference on Computer Vision, ACCV 2016
Y2 - 20 November 2016 through 24 November 2016
ER -