What are Recurrent Neural Networks (RNNs)?
Recurrent Neural Networks (RNNs) are a type of neural network designed for time-dependent or sequence-dependent problems. RNNs are “recurrent” in the sense that they can revisit or reuse past states as inputs when predicting the next or future states. To put it plainly, they have memory. Indeed, memory is what allows us to incorporate our past thoughts and behaviors into our future thoughts and behaviors.
The first successful example of an RNN trained with backpropagation was introduced by Jeffrey Elman, the so-called Elman network (Elman, 1990). Since the Elman network, outstanding progress has been made with RNNs in both basic research and practical applications. Today, RNNs are used for tasks such as machine translation, robotics, speech recognition, speech production, time-series prediction, sequential decision making, modeling of brain activity, and many more.
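To make the idea of recurrence concrete, here is a minimal sketch (in NumPy) of an Elman-style RNN step: the previous hidden state is fed back in alongside the current input, which is what gives the network its memory. All dimensions, weights, and names below are illustrative, not a definitive implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

input_size, hidden_size, output_size = 4, 8, 3  # illustrative sizes

# Parameters: input-to-hidden, hidden-to-hidden (the "recurrent" weights),
# and hidden-to-output, plus biases.
W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))
W_hy = rng.normal(scale=0.1, size=(output_size, hidden_size))
b_h = np.zeros(hidden_size)
b_y = np.zeros(output_size)

def rnn_step(x_t, h_prev):
    """One time step: combine the current input with the previous hidden
    state (the network's 'memory') to get the new hidden state and output."""
    h_t = np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)
    y_t = W_hy @ h_t + b_y
    return h_t, y_t

# Unroll over a toy sequence of 5 time steps.
h = np.zeros(hidden_size)
sequence = rng.normal(size=(5, input_size))
for x_t in sequence:
    h, y = rnn_step(x_t, h)  # h carries information forward through time
print(y.shape)  # (3,)
```

In practice the weights are learned with backpropagation through time rather than left random, but the forward pass above is the core recurrence that the resources below build on.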
Recommended Path for Learning
- Understanding LSTM Networks
- Modeling sequences: A brief overview - Lecture by Geoffrey Hinton (47 min)
Further Learning
Video
- Recurrent Neural Networks - MIT 6.S191
- CS224n: Natural Language Processing with Deep Learning - Stanford / Winter 2020
- Bhiksha Raj’s Deep Learning Lectures 13, 14, and 15 at Carnegie Mellon University
Applied papers
- Botvinick, M., & Plaut, D. C. (2004). Doing without schema hierarchies: A recurrent connectionist approach to normal and impaired routine sequential action. Psychological Review, 111(2), 395.
- Güçlü, U., & van Gerven, M. A. (2017). Modeling the dynamics of human brain activity with recurrent neural networks. Frontiers in Computational Neuroscience, 11, 7.
- Munakata, Y., McClelland, J. L., Johnson, M. H., & Siegler, R. S. (1997). Rethinking infant knowledge: Toward an adaptive process account of successes and failures in object permanence tasks. Psychological Review, 104(4), 686.
Online tutorials
- NLP From Scratch: Classifying Names with a Character-Level RNN (PyTorch)
- Recurrent Neural Networks (RNN) with Keras
- ML with Recurrent Neural Networks (NLP Zero to Hero - Part 4)
Theory papers and book chapters
- Elman, J. L. (1990). Finding Structure in Time. Cognitive Science, 14(2), 179–211.
- Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2), 157–166.
- Sequence Modeling: Recurrent and Recursive Nets in Deep Learning by Goodfellow, Bengio, and Courville.
- Recurrent Neural Networks in Dive into Deep Learning by Zhang, Lipton, and Smola.