From Basics to Bots: My Weekly AI Engineering Adventure-30

Embeddings - Turning Tokens into Meaningful Space

Posted by Afsal on 13-Mar-2026

Hi Pythonistas!

So far, we’ve learned two critical things:

  • ChatGPT predicts the next token
  • Tokens are just numbers

Now a big question appears:

If everything is just numbers… where does meaning come from? That's where embeddings enter the story.

Numbers Alone Mean Nothing

A token ID like 317 or 42 has no meaning by itself. It's just a label.
If the model treated these IDs as plain numbers, learning would be impossible: ID 42 would look "twice as big" as ID 21, and 317 would look like a close neighbor of 318, even though adjacent IDs can belong to completely unrelated words.
So we need a smarter representation.

What Is an Embedding?

An embedding is a vector. Instead of representing a token as a single number, we represent it as a list of numbers.

Example

'king' → [0.21, -0.34, 0.87, ...]

'queen' → [0.19, -0.31, 0.90, ...]
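
In code, an embedding is just a learnable lookup table: token ID in, vector out. Here's a minimal sketch using PyTorch's nn.Embedding (the vocabulary size and dimension below are made up for illustration):

```python
# A minimal sketch, assuming PyTorch is installed.
# Vocabulary size and dimension are made up for illustration.
import torch
import torch.nn as nn

vocab_size, embed_dim = 50_000, 8   # real models use hundreds of dimensions
embedding = nn.Embedding(vocab_size, embed_dim)  # a learnable lookup table

token_id = torch.tensor([317])   # the ID is just a label...
vector = embedding(token_id)     # ...until we look up its vector
print(vector.shape)              # torch.Size([1, 8])
```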

These vectors live in a high-dimensional space. Meaning emerges from position, not labels.

Meaning Comes from Distance

In embedding space:

Similar words → close together

Unrelated words → far apart

So:
"king" is close to "queen"
"cat" is close to "dog"
"cat" is far from "democracy"

The model doesn't know meanings. It learns relationships.
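
We can make "close" and "far" concrete with cosine similarity, a standard way to compare embedding vectors. The tiny 3-D vectors below are invented for illustration; real embeddings have hundreds or thousands of dimensions:

```python
# A minimal sketch of "close vs. far" using cosine similarity.
# These 3-D vectors are invented; real embeddings have far more dimensions.
import torch
import torch.nn.functional as F

cat = torch.tensor([0.8, 0.1, 0.2])
dog = torch.tensor([0.7, 0.2, 0.1])
democracy = torch.tensor([-0.3, 0.9, -0.5])

print(F.cosine_similarity(cat, dog, dim=0).item())        # ~0.98 -> close
print(F.cosine_similarity(cat, democracy, dim=0).item())  # ~-0.28 -> far
```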

How Are Embeddings Learned?

They are not handcrafted. They are learned during training.

During training:

  • The model makes a prediction
  • It makes a mistake
  • Gradients adjust the embedding vectors

Slowly:

  • Useful relationships strengthen
  • Useless ones fade

Embeddings evolve to support prediction.
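
Here's a toy version of that loop, assuming PyTorch. The target vector is invented; in a real model the pressure comes from next-token prediction, but the mechanics (loss, backward, step) are the same:

```python
# A toy sketch of the training loop above, assuming PyTorch.
# The target vector is made up; a real model's pressure comes from
# next-token prediction loss, but the mechanics are identical.
import torch
import torch.nn as nn

embedding = nn.Embedding(10, 4)   # tiny vocabulary of 10 tokens
optimizer = torch.optim.SGD(embedding.parameters(), lr=0.1)
target = torch.tensor([1.0, 0.0, 0.0, 0.0])  # pretend this direction helps prediction
token_id = torch.tensor([3])

for step in range(100):
    vector = embedding(token_id)[0]
    loss = ((vector - target) ** 2).mean()  # the model "makes a mistake"
    optimizer.zero_grad()
    loss.backward()                         # gradients flow back into the table
    optimizer.step()                        # the vector moves a little

print(embedding(token_id)[0])  # now sits close to the target
```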

Asking "what does this word mean?" becomes: Where does this vector sit in space?

Same Token, Different Meaning?

Here’s a subtle but important point.
Early models had static embeddings:

  • One embedding per word
  • Same meaning everywhere

Modern models (like Transformers) create contextual embeddings.
Meaning depends on surrounding tokens.

"bank" in river bank
"bank" in money bank

Same token.
Different embedding.
Different meaning.
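
You can observe this with a pretrained Transformer. A minimal sketch, assuming the Hugging Face transformers package and the bert-base-uncased checkpoint, where "bank" happens to be a single token:

```python
# A minimal sketch, assuming the Hugging Face `transformers` package
# and the `bert-base-uncased` checkpoint ("bank" is a single token there).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def bank_vector(sentence):
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # one vector per token
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return hidden[tokens.index("bank")]

v_river = bank_vector("she sat on the river bank")
v_money = bank_vector("she deposited cash at the bank")
cos = torch.nn.functional.cosine_similarity(v_river, v_money, dim=0)
print(f"cosine similarity: {cos.item():.3f}")  # below 1.0: same token, different vectors
```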

Embeddings Are the Model’s Reality

After this step:

The model no longer deals with text.
It doesn't deal with token IDs.
It only sees vectors.

What I Learned This Week

  • Token IDs are just labels
  • Embeddings turn tokens into vectors
  • Meaning comes from position in space
  • Similar ideas cluster together
  • Context changes embeddings

What's Coming Next

Next week, we'll learn about attention, the mechanism that lets each token look at its neighbors and build those contextual embeddings.