TY - GEN
T1 - Proximal mean-field for neural network quantization
AU - Ajanthan, Thalaiyasingam
AU - Dokania, Puneet
AU - Hartley, Richard
AU - Torr, Philip
N1 - Publisher Copyright:
© 2019 IEEE.
PY - 2019/10
Y1 - 2019/10
N2 - Compressing large Neural Networks (NN) by quantizing the parameters, while maintaining the performance is highly desirable due to reduced memory and time complexity. In this work, we cast NN quantization as a discrete labelling problem, and by examining relaxations, we design an efficient iterative optimization procedure that involves stochastic gradient descent followed by a projection. We prove that our simple projected gradient descent approach is, in fact, equivalent to a proximal version of the well-known mean-field method. These findings would allow the decades-old and theoretically grounded research on MRF optimization to be used to design better network quantization schemes. Our experiments on standard classification datasets (MNIST, CIFAR10/100, TinyImageNet) with convolutional and residual architectures show that our algorithm obtains fully-quantized networks with accuracies very close to the floating-point reference networks.
AB - Compressing large Neural Networks (NN) by quantizing the parameters, while maintaining the performance is highly desirable due to reduced memory and time complexity. In this work, we cast NN quantization as a discrete labelling problem, and by examining relaxations, we design an efficient iterative optimization procedure that involves stochastic gradient descent followed by a projection. We prove that our simple projected gradient descent approach is, in fact, equivalent to a proximal version of the well-known mean-field method. These findings would allow the decades-old and theoretically grounded research on MRF optimization to be used to design better network quantization schemes. Our experiments on standard classification datasets (MNIST, CIFAR10/100, TinyImageNet) with convolutional and residual architectures show that our algorithm obtains fully-quantized networks with accuracies very close to the floating-point reference networks.
UR - http://www.scopus.com/inward/record.url?scp=85081910726&partnerID=8YFLogxK
U2 - 10.1109/ICCV.2019.00497
DO - 10.1109/ICCV.2019.00497
M3 - Conference contribution
T3 - Proceedings of the IEEE International Conference on Computer Vision
SP - 4870
EP - 4879
BT - Proceedings - 2019 International Conference on Computer Vision, ICCV 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 17th IEEE/CVF International Conference on Computer Vision, ICCV 2019
Y2 - 27 October 2019 through 2 November 2019
ER -