TY - GEN
T1 - Learning spatial transforms for refining object segment proposals
AU - Zhang, Haoyang
AU - He, Xuming
AU - Porikli, Fatih
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/5/11
Y1 - 2017/5/11
N2 - We address the problem of object segment proposal generation, which is a critical step in many instance-level semantic segmentation and scene understanding pipelines. In contrast to prior works that predict binary segment masks from images, we take an alternative refinement approach to improve the quality of a given segment candidate pool. In particular, we propose an efficient deep network that learns 2D spatial transforms to warp an initial object mask towards a nearby object region. We formulate this segment refinement task as a regression problem and design a novel feature pooling strategy in our deep network to predict an affine transformation for each object mask. We evaluate our method extensively on two challenging public benchmarks and apply our refinement network to three different initial segment proposal settings. Our results show sizable improvements in average recall across all the settings, achieving state-of-the-art performance.
AB - We address the problem of object segment proposal generation, which is a critical step in many instance-level semantic segmentation and scene understanding pipelines. In contrast to prior works that predict binary segment masks from images, we take an alternative refinement approach to improve the quality of a given segment candidate pool. In particular, we propose an efficient deep network that learns 2D spatial transforms to warp an initial object mask towards a nearby object region. We formulate this segment refinement task as a regression problem and design a novel feature pooling strategy in our deep network to predict an affine transformation for each object mask. We evaluate our method extensively on two challenging public benchmarks and apply our refinement network to three different initial segment proposal settings. Our results show sizable improvements in average recall across all the settings, achieving state-of-the-art performance.
UR - http://www.scopus.com/inward/record.url?scp=85020198756&partnerID=8YFLogxK
U2 - 10.1109/WACV.2017.12
DO - 10.1109/WACV.2017.12
M3 - Conference contribution
T3 - Proceedings - 2017 IEEE Winter Conference on Applications of Computer Vision, WACV 2017
SP - 37
EP - 46
BT - Proceedings - 2017 IEEE Winter Conference on Applications of Computer Vision, WACV 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th IEEE Winter Conference on Applications of Computer Vision, WACV 2017
Y2 - 24 March 2017 through 31 March 2017
ER -