
12 Recurrent Neural Networks

Categories: Neural Networks, Recurrent Neural Networks, LSTM, Deep Learning
This lecture discusses the basics of recurrent neural networks (RNNs), including their architecture, training process, and applications. It also covers the concept of long short-term memory (LSTM) networks and their role in handling sequential data.
Author: Jiuru Lyu

Published: April 7, 2025

Recurrent Neural Networks (RNNs)

  • Applications:
    • Language modeling
    • Sequence tagging
    • Text classification
  • RNNs are a family of neural networks for processing sequential data of arbitrary length.
    • The output of a layer can connect back to the neuron itself or to an earlier layer.
    • The same weights are shared across all time steps.
  • A recurrence function is applied at each step (a minimal code sketch follows this list): \[h_t=f_W(h_{t-1},x_t),\] where
    • \(h_t\) is the new state
    • \(f_W\) is a neural network with parameters \(W\)
    • \(h_{t-1}\) is the old state
    • \(x_t\) is the input feature vector at time step \(t\)
  • Vanilla RNN: the output of the hidden layer at each time step is fed back as an input to the next time step.
    • The problem of long-term dependencies:
      • The appeal of RNNs is their ability to connect previous information to the current task.
      • However, the gap between the relevant information and the point where it is needed can be large.
      • Such long-range dependencies are difficult to learn because of vanishing or exploding gradients (see the gradient demonstration after this list).
  • To solve this problem, we introduce LSTM and GRU networks (a short LSTM usage sketch follows below).
  • There are other, more advanced architectures, such as attention mechanisms and Transformer networks.
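
The recurrence above maps directly to code. Below is a minimal sketch (not from the lecture) of one vanilla RNN step, \(h_t=\tanh(W_{hh}h_{t-1}+W_{xh}x_t+b)\), unrolled over a sequence; all dimension choices are illustrative assumptions.

```python
import torch

# A minimal sketch of the recurrence h_t = f_W(h_{t-1}, x_t) for a
# vanilla RNN, with f_W(h, x) = tanh(W_hh h + W_xh x + b).
# All sizes here are illustrative, not from the lecture.
input_dim, hidden_dim = 8, 16

W_xh = torch.randn(hidden_dim, input_dim) * 0.1   # input-to-hidden weights
W_hh = torch.randn(hidden_dim, hidden_dim) * 0.1  # hidden-to-hidden weights
b = torch.zeros(hidden_dim)

def rnn_step(h_prev, x_t):
    """Apply the recurrence once: new state from old state and input."""
    return torch.tanh(W_hh @ h_prev + W_xh @ x_t + b)

# Unroll over a sequence of arbitrary length; the same weights are
# reused at every time step.
sequence = [torch.randn(input_dim) for _ in range(5)]
h = torch.zeros(hidden_dim)   # initial state h_0
for x_t in sequence:
    h = rnn_step(h, x_t)      # h_t = f_W(h_{t-1}, x_t)
print(h.shape)                # torch.Size([16])
```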
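
To see why long-range dependencies are hard to learn, the rough demonstration below (my own illustration, with arbitrary sizes) backpropagates through many steps of the shared-weight recurrence; the gradient reaching the initial state shrinks toward zero. With a larger weight scale, the same setup would instead explode.

```python
import torch

# Demonstration of vanishing gradients: backpropagating through many
# tanh steps with a shared weight matrix shrinks the gradient.
hidden_dim, T = 16, 100
W_hh = torch.randn(hidden_dim, hidden_dim) * 0.1  # spectral norm < 1 here

h0 = torch.randn(hidden_dim, requires_grad=True)
h = h0
for _ in range(T):
    h = torch.tanh(W_hh @ h)  # the same recurrence applied T times

h.sum().backward()
print(h0.grad.norm())  # prints a tiny number: the gradient has vanished
```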
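
For the LSTM remedy, the following sketch uses PyTorch's built-in nn.LSTM to process a batch of sequences; the gating inside the cell is what lets gradients survive long gaps. The batch size, sequence length, and dimensions are assumed for illustration.

```python
import torch
import torch.nn as nn

# A minimal sketch of handling sequential data with an LSTM.
batch, seq_len, input_dim, hidden_dim = 4, 20, 8, 16

lstm = nn.LSTM(input_size=input_dim, hidden_size=hidden_dim,
               batch_first=True)
x = torch.randn(batch, seq_len, input_dim)

# `output` holds the hidden state at every time step;
# (h_n, c_n) are the final hidden and cell states.
output, (h_n, c_n) = lstm(x)
print(output.shape)  # torch.Size([4, 20, 16])
print(h_n.shape)     # torch.Size([1, 4, 16])
```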