Recurrent Neural Networks

1. Overview

1.1. Definition:

Recurrent Neural Networks (RNNs) are a class of artificial neural networks designed to recognize patterns in sequences of data, such as time series or natural language.

1.2. Key Features:

  • Sequence Handling: RNNs maintain an internal hidden state (a form of memory) that carries information forward from earlier inputs, letting them process a sequence one element at a time.
  • Recurrent Connections: Unlike feedforward neural networks, RNNs contain cyclic connections that let information persist from one time step to the next, which is what allows the network to model temporal dynamics (see the sketch after this list).
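
A minimal sketch of this recurrence in NumPy; the weight names (W_xh, W_hh, b_h) and the sizes are illustrative assumptions, not a reference implementation:

    import numpy as np

    def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
        # The new hidden state mixes the current input with the previous
        # hidden state; this carried-over h is the network's "memory".
        return np.tanh(x_t @ W_xh + h_prev @ W_hh + b_h)

    rng = np.random.default_rng(0)
    W_xh = 0.1 * rng.standard_normal((8, 16))   # input -> hidden (8 -> 16)
    W_hh = 0.1 * rng.standard_normal((16, 16))  # hidden -> hidden (the loop)
    b_h = np.zeros(16)

    h = np.zeros(16)                             # initial hidden state
    for x_t in rng.standard_normal((5, 8)):      # a length-5 input sequence
        h = rnn_step(x_t, h, W_xh, W_hh, b_h)    # state persists across steps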

1.3. Architecture:

  • Hidden Layers: RNNs typically consist of input, hidden, and output layers; at each time step, the hidden layer's output is fed back as part of its own input at the next step.
  • Activation Functions: tanh and ReLU are the most commonly used activation functions in RNNs. However, because plain RNNs suffer from vanishing gradients, gated architectures such as Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU) are often used instead (a minimal example follows this list).
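
As a concrete instance of this input/hidden/output layout, here is a small sketch using PyTorch's nn.RNN, which applies tanh by default; the layer sizes and the 4-way readout are arbitrary choices for illustration:

    import torch
    import torch.nn as nn

    # Input -> recurrent hidden layer -> linear output layer.
    rnn = nn.RNN(input_size=8, hidden_size=16,
                 nonlinearity="tanh", batch_first=True)
    readout = nn.Linear(16, 4)          # output layer; 4 classes (arbitrary)

    x = torch.randn(2, 5, 8)            # (batch, sequence length, features)
    outputs, h_n = rnn(x)               # outputs: hidden state at every step
    y = readout(h_n.squeeze(0))         # predict from the final hidden state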

1.4. Challenges:

  • Vanishing/Exploding Gradients: Training an RNN requires backpropagation through time, which multiplies gradients by the recurrent weights once per time step. Over long sequences the gradients can therefore shrink toward zero (vanishing) or grow without bound (exploding), making long-range dependencies hard to learn; see the sketch below.
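
The sketch below illustrates the effect with a deliberate idealization: it ignores the activation's derivative and tracks only the repeated multiplication by the recurrent weight matrix; the 0.5 and 1.5 scalings are assumptions chosen to force each regime:

    import numpy as np

    rng = np.random.default_rng(0)
    for scale, label in [(0.5, "vanishing"), (1.5, "exploding")]:
        # Orthogonal matrix scaled so every singular value equals `scale`.
        W_hh = scale * np.linalg.qr(rng.standard_normal((16, 16)))[0]
        g = np.ones(16)                  # gradient arriving at the last step
        for _ in range(50):              # 50 steps back through time
            g = W_hh.T @ g               # one Jacobian factor per step
        print(f"{label}: |g| after 50 steps = {np.linalg.norm(g):.3e}")

In practice, exploding gradients are commonly tamed with gradient clipping (e.g. torch.nn.utils.clip_grad_norm_ in PyTorch), while vanishing gradients motivate the gated architectures in the next subsection.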

1.5. Advancements:

  • LSTM and GRU: These RNN variants add gating mechanisms (input, forget, and output gates in the LSTM; update and reset gates in the GRU) that regulate how information flows through the hidden state, letting them capture long-range dependencies while largely avoiding the vanishing-gradient problem.
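
A minimal usage sketch, assuming PyTorch's built-in nn.LSTM and nn.GRU modules (sizes are illustrative); both are near drop-in replacements for nn.RNN:

    import torch
    import torch.nn as nn

    lstm = nn.LSTM(input_size=8, hidden_size=16, batch_first=True)
    gru = nn.GRU(input_size=8, hidden_size=16, batch_first=True)

    x = torch.randn(2, 5, 8)         # (batch, sequence length, features)
    out_l, (h_n, c_n) = lstm(x)      # LSTM also carries a cell state c_n
    out_g, h_g = gru(x)              # GRU keeps a single hidden state
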
Tags: arch, ml