
Neural Turing Machine

Introduced by Graves et al. in Neural Turing Machines

A Neural Turing Machine is a working-memory neural network model that couples a neural network controller with external memory resources. The whole architecture is differentiable end to end and can therefore be trained with gradient descent. The model can infer simple algorithms such as copying, sorting, and associative recall from input and output examples.

A Neural Turing Machine (NTM) architecture contains two basic components: a neural network controller and a memory bank. The Figure presents a high-level diagram of the NTM architecture. Like most neural networks, the controller interacts with the external world via input and output vectors. Unlike a standard network, it also interacts with a memory matrix using selective read and write operations. By analogy to the Turing machine we refer to the network outputs that parameterise these operations as “heads.”

Every component of the architecture is differentiable. This is achieved by defining 'blurry' read and write operations that interact to a greater or lesser degree with all the elements in memory (rather than addressing a single element, as in a normal Turing machine or digital computer). The degree of blurriness is determined by an attentional "focus" mechanism that constrains each read and write operation to interact with a small portion of the memory, while ignoring the rest. Because interaction with the memory is highly sparse, the NTM is biased towards storing data without interference. The memory location brought into attentional focus is determined by specialised outputs emitted by the heads. These outputs define a normalised weighting over the rows in the memory matrix (referred to as memory "locations"). Each weighting, one per read or write head, defines the degree to which the head reads or writes at each location. A head can thereby attend sharply to the memory at a single location or weakly to the memory at many locations.
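The weighted read and erase-then-add write described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a memory matrix `M` of shape `(N, W)` (N locations, each a W-dimensional vector) and a head weighting `w` that is already normalised; every operation is a convex combination over rows, which is what keeps the mechanism differentiable.

```python
import numpy as np

def read(M, w):
    # Blurry read: a weighted sum of memory rows, r = sum_i w[i] * M[i].
    return w @ M

def write(M, w, erase, add):
    # Erase-then-add update from the NTM paper:
    #   M[i] <- M[i] * (1 - w[i] * erase) + w[i] * add
    # With a sharp (one-hot) w this overwrites a single location;
    # with a soft w the update is smeared over many locations.
    M = M * (1.0 - np.outer(w, erase))
    return M + np.outer(w, add)

# Toy usage: focus sharply on location 2, write a vector, read it back.
N, W = 8, 4
M = np.zeros((N, W))
w = np.zeros(N)
w[2] = 1.0                                   # sharp attentional focus
M = write(M, w, erase=np.ones(W), add=np.arange(W, dtype=float))
r = read(M, w)                               # recovers [0., 1., 2., 3.]
```

In the full model, `w`, `erase`, and `add` are all emitted by the controller network, so gradients flow through the addressing as well as the content.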

Source: Neural Turing Machines

Latest Papers

Unsupervised Speaker Adaptation using Attention-based Speaker Memory for End-to-End ASR
Leda Sarı, Niko Moritz, Takaaki Hori, Jonathan Le Roux
Memory-Augmented Recurrent Networks for Dialogue Coherence
David Donahue, Yuanliang Meng, Anna Rumshisky
A Neural Turing Machine for Conditional Transition Graph Modeling
Mehdi Ben Lazreg, Morten Goodwin, Ole-Christoffer Granmo
Understanding Memory Modules on Learning Simple Algorithms
Kexin Wang, Yu Zhou, Shaonan Wang, Jiajun Zhang, Chengqing Zong
A review on Neural Turing Machine
Soroor Malekmohammadi Faradonbeh, Faramarz Safi-Esfahani
Few-Shot Generalization Across Dialogue Tasks
Vladimir Vlasov, Akela Drissner-Schmid, Alan Nichol
Context-Aware Neural Model for Temporal Information Extraction
Yuanliang Meng, Anna Rumshisky
A Taxonomy for Neural Memory Networks
Ying Ma, Jose Principe
Meta-Learning via Feature-Label Memory Network
Dawit Mureja, Hyunsin Park, Chang D. Yoo
Attention-Set based Metric Learning for Video Face Recognition
Yibo Hu, Xiang Wu, Ran He
Tracking the World State with Recurrent Entity Networks
Mikael Henaff, Jason Weston, Arthur Szlam, Antoine Bordes, Yann LeCun
Neural Turing Machines: Convergence of Copy Tasks
Janez Aleš
Dynamic Neural Turing Machine with Soft and Hard Addressing Schemes
Caglar Gulcehre, Sarath Chandar, Kyunghyun Cho, Yoshua Bengio
Lie Access Neural Turing Machine
Greg Yang
Empirical Study on Deep Learning Models for Question Answering
Yang Yu, Wei Zhang, Chung-Wei Hang, Bing Xiang, Bowen Zhou
A Deep Memory-based Architecture for Sequence-to-Sequence Learning
Fandong Meng, Zhengdong Lu, Zhaopeng Tu, Hang Li, Qun Liu
Reinforcement Learning Neural Turing Machines - Revised
Wojciech Zaremba, Ilya Sutskever
Neural Turing Machines
Alex Graves, Greg Wayne, Ivo Danihelka