TY - GEN
T1 - Inferring the Class Conditional Response Map for Weakly Supervised Semantic Segmentation
AU - Sun, Weixuan
AU - Zhang, Jing
AU - Barnes, Nick
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Image-level weakly supervised semantic segmentation (WSSS) relies on class activation maps (CAMs) for pseudo labels generation. As CAMs only highlight the most discriminative regions of objects, the generated pseudo labels are usually unsatisfactory to serve directly as supervision. To solve this, most existing approaches follow a multi-training pipeline to refine CAMs for better pseudo-labels, which includes: 1) re-training the classification model to generate CAMs; 2) post-processing CAMs to obtain pseudo labels; and 3) training a semantic segmentation model with the obtained pseudo labels. However, this multi-training pipeline requires complicated adjustment and additional time. To address this, we propose a class-conditional inference strategy and an activation aware mask refinement loss function to generate better pseudo labels without retraining the classifier. The class conditional inference-time approach is presented to separately and iteratively reveal the classification network's hidden object activation to generate more complete response maps. Further, our activation aware mask refinement loss function introduces a novel way to exploit saliency maps during segmentation training and refine the foreground object masks without suppressing background objects. Our method achieves superior WSSS results without requiring re-training of the classifier. https://github.com/weixuansun/InferCam
AB - Image-level weakly supervised semantic segmentation (WSSS) relies on class activation maps (CAMs) for pseudo labels generation. As CAMs only highlight the most discriminative regions of objects, the generated pseudo labels are usually unsatisfactory to serve directly as supervision. To solve this, most existing approaches follow a multi-training pipeline to refine CAMs for better pseudo-labels, which includes: 1) re-training the classification model to generate CAMs; 2) post-processing CAMs to obtain pseudo labels; and 3) training a semantic segmentation model with the obtained pseudo labels. However, this multi-training pipeline requires complicated adjustment and additional time. To address this, we propose a class-conditional inference strategy and an activation aware mask refinement loss function to generate better pseudo labels without retraining the classifier. The class conditional inference-time approach is presented to separately and iteratively reveal the classification network's hidden object activation to generate more complete response maps. Further, our activation aware mask refinement loss function introduces a novel way to exploit saliency maps during segmentation training and refine the foreground object masks without suppressing background objects. Our method achieves superior WSSS results without requiring re-training of the classifier. https://github.com/weixuansun/InferCam
KW - Grouping and Shape
KW - Segmentation
UR - http://www.scopus.com/inward/record.url?scp=85126122809&partnerID=8YFLogxK
U2 - 10.1109/WACV51458.2022.00271
DO - 10.1109/WACV51458.2022.00271
M3 - Conference contribution
T3 - Proceedings - 2022 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022
SP - 2653
EP - 2662
BT - Proceedings - 2022 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 22nd IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2022
Y2 - 4 January 2022 through 8 January 2022
ER -