TY - GEN
T1 - Learning dynamic hierarchical models for anytime scene labeling
AU - Liu, Buyu
AU - He, Xuming
N1 - Publisher Copyright: © Springer International Publishing AG 2016.
PY - 2016
Y1 - 2016
AB - With increasing demand for efficient image and video analysis, the test-time cost of scene parsing becomes critical for many large-scale or time-sensitive vision applications. We propose a dynamic hierarchical model for anytime scene labeling that allows us to achieve flexible tradeoffs between efficiency and accuracy in pixel-level prediction. In particular, our approach incorporates the cost of feature computation and model inference, and optimizes the model performance for any given test-time budget by learning a sequence of image-adaptive hierarchical models. We formulate this anytime representation learning as a Markov Decision Process with a discrete-continuous state-action space. A high-quality policy of feature and model selection is learned based on an approximate policy iteration method with an action proposal mechanism. We demonstrate the advantages of our dynamic non-myopic anytime scene parsing on three semantic segmentation datasets, on which it achieves 90% of the performance of state-of-the-art methods while using 15% of their overall cost.
UR - http://www.scopus.com/inward/record.url?scp=84990050134&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-46466-4_39
DO - 10.1007/978-3-319-46466-4_39
M3 - Conference contribution
AN - SCOPUS:84990050134
SN - 9783319464657
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 650
EP - 666
BT - Computer Vision - 14th European Conference, ECCV 2016, Proceedings
A2 - Leibe, Bastian
A2 - Matas, Jiri
A2 - Sebe, Nicu
A2 - Welling, Max
PB - Springer Verlag
T2 - 14th European Conference on Computer Vision, ECCV 2016
Y2 - 8 October 2016 through 16 October 2016
ER -