TY - CONF
T1 - Integrated deep and shallow networks for salient object detection
AU - Zhang, Jing
AU - Li, Bo
AU - Dai, Yuchao
AU - Porikli, Fatih
AU - He, Mingyi
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/7/2
Y1 - 2017/7/2
N2 - Deep convolutional neural network (CNN) based salient object detection methods have achieved state-of-the-art performance, outperforming unsupervised methods by a wide margin. In this paper, we propose to integrate deep and unsupervised saliency for salient object detection under a unified framework. Specifically, our method takes the results of an unsupervised saliency method (Robust Background Detection, RBD) and normalized color images as inputs, and directly learns an end-to-end mapping between the inputs and the corresponding saliency maps. The color images are fed into a fully convolutional neural network (FCNN) adapted from semantic segmentation to exploit high-level semantic cues for salient object detection. The outputs of the deep FCNN and RBD are then concatenated and fed into a shallow network that maps the concatenated feature maps to saliency maps. Finally, to obtain a spatially consistent saliency map with sharp object boundaries, we fuse superpixel-level saliency maps at multiple scales. Extensive experimental results on 8 benchmark datasets demonstrate that the proposed method outperforms state-of-the-art approaches by a clear margin.
AB - Deep convolutional neural network (CNN) based salient object detection methods have achieved state-of-the-art performance, outperforming unsupervised methods by a wide margin. In this paper, we propose to integrate deep and unsupervised saliency for salient object detection under a unified framework. Specifically, our method takes the results of an unsupervised saliency method (Robust Background Detection, RBD) and normalized color images as inputs, and directly learns an end-to-end mapping between the inputs and the corresponding saliency maps. The color images are fed into a fully convolutional neural network (FCNN) adapted from semantic segmentation to exploit high-level semantic cues for salient object detection. The outputs of the deep FCNN and RBD are then concatenated and fed into a shallow network that maps the concatenated feature maps to saliency maps. Finally, to obtain a spatially consistent saliency map with sharp object boundaries, we fuse superpixel-level saliency maps at multiple scales. Extensive experimental results on 8 benchmark datasets demonstrate that the proposed method outperforms state-of-the-art approaches by a clear margin.
KW - Fully convolutional neural networks
KW - Multi-scale fusion
KW - Robust background detection
KW - Salient object detection
KW - Shallow network
UR - http://www.scopus.com/inward/record.url?scp=85045302206&partnerID=8YFLogxK
U2 - 10.1109/ICIP.2017.8296539
DO - 10.1109/ICIP.2017.8296539
M3 - Conference contribution
T3 - Proceedings - International Conference on Image Processing, ICIP
SP - 1537
EP - 1541
BT - 2017 IEEE International Conference on Image Processing, ICIP 2017 - Proceedings
PB - IEEE Computer Society
T2 - 24th IEEE International Conference on Image Processing, ICIP 2017
Y2 - 17 September 2017 through 20 September 2017
ER -