Recurrent Highway Networks with Grouped Auxiliary Memory

IEEE Access 2019 · Wei Luo, Feng Yu

Recurrent neural networks (RNNs) are challenging to train, let alone those with deep spatial structures. Architectures built upon highway connections, such as the Recurrent Highway Network (RHN), were developed to allow greater step-to-step transition depth, leading to more expressive models. However, problems that require capturing long-term dependencies still cannot be well addressed by these models. Moreover, the ability to retain long-term memories tends to diminish as spatial depth increases, since deeper structures can exacerbate vanishing gradients. In this paper, we address these issues by proposing a novel RNN architecture based on the RHN, namely the Recurrent Highway Network with Grouped Auxiliary Memory (GAM-RHN). The proposed architecture interconnects the RHN with a set of auxiliary memory units dedicated to storing long-term information via read and write operations, analogous to Memory-Augmented Neural Networks (MANNs). Experimental results on artificial long time lag tasks show that GAM-RHNs can be trained efficiently while being deep in both time and space. We also evaluate the proposed architecture on a variety of tasks, including language modeling, sequential image classification, and financial market forecasting. The potential of our approach is demonstrated by achieving state-of-the-art results on these tasks.
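To make the idea concrete, below is a minimal PyTorch sketch of a single recurrent step in this style: a highway transition whose carried state is augmented by a content-based read from an external memory, followed by a soft write back into that memory. This is an illustrative simplification under assumed details; the class and parameter names (GAMRHNCellSketch, read_key, write_val, etc.) are hypothetical, and the paper's exact gating equations, memory grouping, and multi-layer transition depth are not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GAMRHNCellSketch(nn.Module):
    """Illustrative sketch (not the paper's exact method): one highway
    recurrence step coupled to an external memory via MANN-style
    content-based read/write attention."""

    def __init__(self, input_size, hidden_size, num_slots, slot_size):
        super().__init__()
        # Highway candidate transform H and transfer gate T
        # (a single transition layer is shown for brevity).
        self.h_lin = nn.Linear(input_size + hidden_size, hidden_size)
        self.t_lin = nn.Linear(input_size + hidden_size, hidden_size)
        # Projections for addressing and updating the memory slots.
        self.read_key = nn.Linear(hidden_size, slot_size)
        self.write_key = nn.Linear(hidden_size, slot_size)
        self.write_val = nn.Linear(hidden_size, slot_size)
        self.read_proj = nn.Linear(slot_size, hidden_size)
        self.num_slots, self.slot_size = num_slots, slot_size

    def forward(self, x, state, memory):
        # x: (B, input), state: (B, hidden), memory: (B, slots, slot_size)
        # Read: soft attention over slots by key similarity.
        rk = self.read_key(state)                               # (B, slot_size)
        attn = F.softmax(torch.bmm(memory, rk.unsqueeze(2)).squeeze(2), dim=1)
        read = torch.bmm(attn.unsqueeze(1), memory).squeeze(1)  # (B, slot_size)
        s = state + self.read_proj(read)    # inject long-term information
        # Highway transition: gate t interpolates candidate h and carry s,
        # preserving a direct path for gradients across depth.
        xs = torch.cat([x, s], dim=1)
        h = torch.tanh(self.h_lin(xs))
        t = torch.sigmoid(self.t_lin(xs))
        new_state = t * h + (1.0 - t) * s
        # Write: blend a new value into slots, weighted by write attention.
        wk = self.write_key(new_state)
        w_attn = F.softmax(torch.bmm(memory, wk.unsqueeze(2)).squeeze(2), dim=1)
        wv = self.write_val(new_state).unsqueeze(1)             # (B, 1, slot_size)
        memory = memory + w_attn.unsqueeze(2) * (wv - memory)
        return new_state, memory
```

The key design point this sketch illustrates is the separation of roles: the highway state handles short-range transitions with trainable depth, while the slot memory persists across time steps and is only touched through attention-weighted reads and writes, so long-term information need not survive every deep transition.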


Results from the Paper


Task                            | Dataset                         | Model        | Metric Name             | Metric Value | Global Rank
Stock Trend Prediction          | FI-2010                         | BL-GAM-RHN-7 | F1 (H50)                | 0.8088       | # 1
Stock Trend Prediction          | FI-2010                         | BL-GAM-RHN-7 | Accuracy (H50)          | 0.8202       | # 1
Language Modelling              | Penn Treebank (Character Level) | GAM-RHN-5    | Bit per Character (BPC) | 1.147        | # 3
Language Modelling              | Penn Treebank (Character Level) | GAM-RHN-5    | Number of params        | 16.0M        | # 6
Sequential Image Classification | Sequential MNIST                | GAM-RHN-1    | Permuted Accuracy       | 96.8%        | # 17
Language Modelling              | Text8                           | GAM-RHN-10   | Bit per Character (BPC) | 1.157        | # 12
Language Modelling              | Text8                           | GAM-RHN-10   | Number of params        | 44.7M        | # 11
