Hi Pythonistas!
So far, we’ve learned two critical things:
- ChatGPT predicts the next token
- Tokens are just numbers
Now a big question appears:
If everything is just numbers… where does meaning come from? That's where embeddings enter the story.
Numbers Alone Mean Nothing
Token IDs like 317 or 42 have no meaning by themselves. They're just labels.
If the model treated them as ordinary numbers (as if token 42 were somehow "less than" token 317), learning meaning would be impossible.
So we need a smarter representation.
What Is an Embedding?
An embedding is a vector. Instead of representing a token as a single number, we represent it as a list of numbers.
Example
'king' → [0.21, -0.34, 0.87, ...]
'queen' → [0.19, -0.31, 0.90, ...]
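Under the hood, an embedding layer is just a big table of vectors, and looking up a token's embedding is just indexing a row. Here's a minimal sketch with a made-up 3-token vocabulary and 4-dimensional vectors (real models use tens of thousands of tokens and hundreds or thousands of dimensions):

```python
import numpy as np

# Illustrative only: tiny vocabulary, made-up numbers.
embedding_table = np.array([
    [0.21, -0.34, 0.87, 0.10],   # row 0 → "king"
    [0.19, -0.31, 0.90, 0.12],   # row 1 → "queen"
    [-0.50, 0.62, 0.05, -0.40],  # row 2 → "cat"
])

token_id = 1                         # pretend the tokenizer mapped "queen" to ID 1
vector = embedding_table[token_id]   # embedding lookup = row indexing
print(vector)                        # the 4-dimensional vector for "queen"
```

The token ID itself carries no information; it only tells the model *which row* of the table to read.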
These vectors live in a high-dimensional space. Meaning emerges from position, not from labels.
Meaning Comes from Distance
In embedding space:
- Similar words → close together
- Unrelated words → far apart
So:
- "king" is close to "queen"
- "cat" is close to "dog"
- "cat" is far from "democracy"
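We can make "close" precise with cosine similarity: the cosine of the angle between two vectors, where 1.0 means "pointing the same way." Here's a small demo using made-up 3-dimensional vectors chosen so the geometry matches the intuition:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up vectors, for illustration only.
king = np.array([0.21, -0.34, 0.87])
queen = np.array([0.19, -0.31, 0.90])
democracy = np.array([-0.80, 0.45, -0.10])

print(cosine_similarity(king, queen))      # close to 1.0 → similar
print(cosine_similarity(king, democracy))  # negative → unrelated
```

Real embedding libraries use exactly this measure to answer "which words are most similar to X?"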
The model doesn’t know meanings. It learns relationships.
How Are Embeddings Learned?
They are not handcrafted. They are learned during training.
During training:
- The model makes a prediction
- It makes a mistake
- Gradients adjust the embedding vectors
Slowly:
- Useful relationships strengthen
- Useless ones fade
Embeddings evolve to support prediction.
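The real training loop backpropagates a prediction loss through the whole network, but the effect on embeddings can be sketched with a toy: start from random vectors, then repeatedly nudge the vectors of words that appear together toward each other. Everything here (the words, the nudge rule, the learning rate) is made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Start from random vectors — embeddings are learned, not handcrafted.
vectors = {w: rng.normal(size=4) for w in ["cat", "dog", "democracy"]}

def nudge_closer(a, b, lr=0.1):
    """Gradient-style update: pull two co-occurring words toward each other."""
    delta = vectors[b] - vectors[a]
    vectors[a] += lr * delta
    vectors[b] -= lr * delta

def distance(a, b):
    return float(np.linalg.norm(vectors[a] - vectors[b]))

before = distance("cat", "dog")
for _ in range(50):              # many small corrections, like many training steps
    nudge_closer("cat", "dog")
after = distance("cat", "dog")
print(before, "->", after)       # "cat" and "dog" end up much closer
```

After enough steps, "cat" and "dog" sit close together while "democracy" stays wherever it started. That clustering is the learned relationship.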
Asking "what does this word mean?" becomes: Where does this vector sit in space?
Same Token, Different Meaning?
Here’s a subtle but important point.
Early models had static embeddings:
- One embedding per word
- Same meaning everywhere
Modern models (like Transformers) create contextual embeddings.
Meaning depends on surrounding tokens.
"bank" in river bank
"bank" in money bank
Same token.
Different embedding.
Different meaning.
Embeddings Are the Model’s Reality
After this step:
- The model no longer deals with text
- It doesn’t deal with token IDs
- It only sees vectors
What I Learned This Week
- Token IDs are just labels
- Embeddings turn tokens into vectors
- Meaning comes from position in space
- Similar ideas cluster together
- Context changes embeddings
What's Coming Next
Next, we'll learn about attention: the mechanism that makes contextual embeddings possible.