From Basics to Bots: My Weekly AI Engineering Adventure-23

Recurrent Neural Networks (RNNs) - Learning from Sequences

Posted by Afsal on 23-Jan-2026

Hi Pythonistas!

In the last post we learned about CNNs. CNNs are great at looking at images.
But some problems aren’t about images. They’re about order.

  • Words in a sentence
  • Time series data
  • Speech
  • Stock prices

For these, sequence matters.

Enter Recurrent Neural Networks (RNNs).

Why Dense & CNNs Fall Short for Sequences

Dense layers:

  • See everything at once
  • Forget what came before

CNNs:

  • Capture local patterns
  • Struggle with long-range dependencies

In sequences:

“I am not happy”

That “not” changes everything.
We need memory.
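
To see this concretely, here is a tiny plain-Python sketch (just a toy, using only the standard library): if a model only looks at word counts and ignores position, it literally cannot tell two orderings of the same words apart.

from collections import Counter

# Two "sentences" built from the same words in a different order
a = Counter("i am not happy".split())
b = Counter("happy am i not".split())

print(a)       # word counts carry no information about position
print(a == b)  # True: an order-blind representation treats them as identical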

The Core Idea of RNNs

RNNs process data one step at a time.

At each step, they take:

  • Current input
  • Previous hidden state (memory)
And produce:

  • Output
  • New hidden state

Same network, reused again and again.

That’s the recurrent part.
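
Here is a minimal NumPy sketch of that loop (the sizes and random weights are arbitrary, purely for illustration):

import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 4, 8

W_xh = rng.normal(size=(hidden_size, input_size))   # input -> hidden weights
W_hh = rng.normal(size=(hidden_size, hidden_size))  # hidden -> hidden weights
b_h = np.zeros(hidden_size)

def rnn_step(x_t, h_prev):
    """One time step: mix the current input with the previous hidden state."""
    return np.tanh(W_xh @ x_t + W_hh @ h_prev + b_h)

sequence = rng.normal(size=(5, input_size))  # a sequence of 5 time steps
h = np.zeros(hidden_size)                    # initial memory

for x_t in sequence:      # the same weights are reused at every step
    h = rnn_step(x_t, h)  # memory is carried forward

print(h.shape)  # (8,) -- the final hidden state summarizes the whole sequence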

Think of It Like This

An RNN is like:

  • Reading a sentence word by word
  • Keeping notes in your head
  • Updating your understanding as you go

You don’t reread the entire sentence every time; you just remember.

Why RNNs Are Powerful

  • Handle variable-length inputs
  • Capture temporal patterns
  • Share parameters across time

That’s why RNNs were huge in:

  • Language modeling
  • Speech recognition
  • Time-series forecasting
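
As a rough sketch (assuming TensorFlow/Keras is installed), this is what a tiny sequence classifier could look like. The None in the input shape is what lets the model accept sequences of any length, and the SimpleRNN layer reuses the same weights at every time step:

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(None, 16)),   # any number of time steps, 16 features each
    tf.keras.layers.SimpleRNN(32),             # 32-unit hidden state, shared across time
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()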

The Big Problem with RNNs

Remember Chapter 20?

Yep: vanishing & exploding gradients.

In long sequences:

Gradients vanish → early words forgotten

Gradients explode → unstable training

RNNs forget long-term context very easily.
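
A toy illustration (just arithmetic, not real training): backpropagation through time multiplies many step-by-step factors together, and that product either shrinks towards zero or blows up.

steps = 50

print(0.9 ** steps)  # ~0.005 -> the signal from early steps has almost vanished
print(1.1 ** steps)  # ~117   -> the signal explodes and training becomes unstable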

LSTM & GRU: Smarter RNNs

To fix this, we got:

LSTM (Long Short-Term Memory)
GRU (Gated Recurrent Unit)

They introduce gates that decide:

  • What to remember
  • What to forget
  • What to pass forward

Think of them as RNNs with filters on memory.
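
In Keras (again assuming TensorFlow is installed), swapping in a gated layer is a one-line change. A quick sketch that compares parameter counts shows the extra machinery the gates bring:

import tensorflow as tf

# Same 32-unit recurrent layer, three flavours: the gates in GRU and LSTM add parameters
for layer_cls in (tf.keras.layers.SimpleRNN, tf.keras.layers.GRU, tf.keras.layers.LSTM):
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(None, 16)),  # 16 features per time step
        layer_cls(32),
    ])
    print(f"{layer_cls.__name__}: {model.count_params()} parameters")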

RNNs in the Real World

You’ll find RNNs in:

  • Text generation
  • Machine translation
  • Speech-to-text
  • Time-series prediction

The Transition to Transformers

RNNs taught us:

  • Sequences need memory
  • Order matters

But they’re:

  • Slow (sequential processing)
  • Hard to train on long sequences

This opened the door for Transformers (coming soon).

What I Learned This Week

  • RNNs handle sequential data
  • They reuse the same network across time
  • Memory comes from hidden states
  • Vanishing gradients limit long-term memory
  • LSTM & GRU improved things

RNNs were the first models that truly remembered and they paved the way for everything that came after.

What's next

You already know what's next: Transformers.