TY - GEN
T1 - Learning a perspective-embedded deconvolution network for crowd counting
AU - Zhao, Muming
AU - Zhang, Jian
AU - Porikli, Fatih
AU - Zhang, Chongyang
AU - Zhang, Wenjun
N1 - Publisher Copyright:
© 2017 IEEE.
PY - 2017/8/28
Y1 - 2017/8/28
N2 - We present a novel deep learning framework for crowd counting by learning a perspective-embedded deconvolution network. Perspective is an inherent property of most surveillance scenes. Unlike the traditional approaches that exploit the perspective as a separate normalization, we propose to fuse the perspective into a deconvolution network, aiming to obtain a robust, accurate and consistent crowd density map. Through layer-wise fusion, we merge perspective maps at different resolutions into the deconvolution network. With the injection of perspective, our network is driven to learn to combine the underlying scene geometric constraints adaptively, thus enabling an accurate interpretation from high-level feature maps to the pixel-wise crowd density map. In addition, our network allows generating density map for arbitrary-sized input in an end-to-end fashion. The proposed method achieves competitive result on the WorldExpo2010 crowd dataset.
AB - We present a novel deep learning framework for crowd counting by learning a perspective-embedded deconvolution network. Perspective is an inherent property of most surveillance scenes. Unlike the traditional approaches that exploit the perspective as a separate normalization, we propose to fuse the perspective into a deconvolution network, aiming to obtain a robust, accurate and consistent crowd density map. Through layer-wise fusion, we merge perspective maps at different resolutions into the deconvolution network. With the injection of perspective, our network is driven to learn to combine the underlying scene geometric constraints adaptively, thus enabling an accurate interpretation from high-level feature maps to the pixel-wise crowd density map. In addition, our network allows generating density map for arbitrary-sized input in an end-to-end fashion. The proposed method achieves competitive result on the WorldExpo2010 crowd dataset.
KW - Crowd counting
KW - Deconvolution network
KW - Perspective
UR - http://www.scopus.com/inward/record.url?scp=85030219856&partnerID=8YFLogxK
U2 - 10.1109/ICME.2017.8019501
DO - 10.1109/ICME.2017.8019501
M3 - Conference contribution
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
SP - 403
EP - 408
BT - 2017 IEEE International Conference on Multimedia and Expo, ICME 2017
PB - IEEE Computer Society
T2 - 2017 IEEE International Conference on Multimedia and Expo, ICME 2017
Y2 - 10 July 2017 through 14 July 2017
ER -