TY - GEN
T1 - Tips and Tricks for Visual Question Answering
T2 - 31st Meeting of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
AU - Teney, Damien
AU - Anderson, Peter
AU - He, Xiaodong
AU - Hengel, Anton Van Den
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/12/14
Y1 - 2018/12/14
N2 - Deep Learning has had a transformative impact on Computer Vision, but for all of the success there is also a significant cost. This is that the models and procedures used are so complex and intertwined that it is often impossible to distinguish the impact of the individual design and engineering choices each model embodies. This ambiguity diverts progress in the field, and leads to a situation where developing a state-of-the-art model is as much an art as a science. As a step towards addressing this problem we present a massive exploration of the effects of the myriad architectural and hyperparameter choices that must be made in generating a state-of-the-art model. The model is of particular interest because it won the 2017 Visual Question Answering Challenge. We provide a detailed analysis of the impact of each choice on model performance, in the hope that it will inform others in developing models, but also that it might set a precedent that will accelerate scientific progress in the field.
AB - Deep Learning has had a transformative impact on Computer Vision, but for all of the success there is also a significant cost. This is that the models and procedures used are so complex and intertwined that it is often impossible to distinguish the impact of the individual design and engineering choices each model embodies. This ambiguity diverts progress in the field, and leads to a situation where developing a state-of-the-art model is as much an art as a science. As a step towards addressing this problem we present a massive exploration of the effects of the myriad architectural and hyperparameter choices that must be made in generating a state-of-the-art model. The model is of particular interest because it won the 2017 Visual Question Answering Challenge. We provide a detailed analysis of the impact of each choice on model performance, in the hope that it will inform others in developing models, but also that it might set a precedent that will accelerate scientific progress in the field.
UR - http://www.scopus.com/inward/record.url?scp=85061750330&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2018.00444
DO - 10.1109/CVPR.2018.00444
M3 - Conference contribution
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 4223
EP - 4232
BT - Proceedings - 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2018
PB - IEEE Computer Society
Y2 - 18 June 2018 through 22 June 2018
ER -