TY - GEN
T1 - An Input Residual Connection for Simplifying Gated Recurrent Neural Networks
AU - Kuo, Nicholas I.H.
AU - Harandi, Mehrtash
AU - Fourrier, Nicolas
AU - Walder, Christian
AU - Ferraro, Gabriela
AU - Suominen, Hanna
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/7
Y1 - 2020/7
N2 - Gated Recurrent Neural Networks (GRNNs) are important models that continue to push the state of the art across a range of machine learning problems. However, they are composed of intricate components that are generally not well understood. We increase GRNN interpretability by linking the canonical Gated Recurrent Unit (GRU) design to the well-studied Hopfield network. This connection allowed us to identify network redundancies, which we simplified with an Input Residual Connection (IRC). We tested GRNNs against their IRC counterparts on language modelling. In addition, we proposed an Input Highway Connection (IHC) as an advanced application of the IRC, and evaluated the most widely applied GRNN, the Long Short-Term Memory (LSTM), against the IHC-LSTM on the tasks of i) image generation and ii) learning to learn to update another learner network. Despite their parameter reductions, all IRC-GRNNs showed generalisation comparable or superior to their baseline models. Furthermore, compared to the LSTM, the IHC-LSTM removed 85.4% of the parameters on image generation. In conclusion, the IRC is applicable not only to the GRNN designs of GRUs and LSTMs but also to FastGRNNs, Simple Recurrent Units (SRUs), and Strongly-Typed Recurrent Neural Networks (T-RNNs).
AB - Gated Recurrent Neural Networks (GRNNs) are important models that continue to push the state of the art across a range of machine learning problems. However, they are composed of intricate components that are generally not well understood. We increase GRNN interpretability by linking the canonical Gated Recurrent Unit (GRU) design to the well-studied Hopfield network. This connection allowed us to identify network redundancies, which we simplified with an Input Residual Connection (IRC). We tested GRNNs against their IRC counterparts on language modelling. In addition, we proposed an Input Highway Connection (IHC) as an advanced application of the IRC, and evaluated the most widely applied GRNN, the Long Short-Term Memory (LSTM), against the IHC-LSTM on the tasks of i) image generation and ii) learning to learn to update another learner network. Despite their parameter reductions, all IRC-GRNNs showed generalisation comparable or superior to their baseline models. Furthermore, compared to the LSTM, the IHC-LSTM removed 85.4% of the parameters on image generation. In conclusion, the IRC is applicable not only to the GRNN designs of GRUs and LSTMs but also to FastGRNNs, Simple Recurrent Units (SRUs), and Strongly-Typed Recurrent Neural Networks (T-RNNs).
KW - GRU
KW - Hopfield network
KW - LSTM
KW - interpretability
UR - http://www.scopus.com/inward/record.url?scp=85093828067&partnerID=8YFLogxK
U2 - 10.1109/IJCNN48605.2020.9207238
DO - 10.1109/IJCNN48605.2020.9207238
M3 - Conference contribution
T3 - Proceedings of the International Joint Conference on Neural Networks
BT - 2020 International Joint Conference on Neural Networks, IJCNN 2020 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 International Joint Conference on Neural Networks, IJCNN 2020
Y2 - 19 July 2020 through 24 July 2020
ER -