From Basics to Bots: My Weekly AI Engineering Adventure - 2

Tiny Search Engine using Python

Posted by Afsal on 20-Aug-2025

Hi Pythonistas!

In the previous post, we learned the math behind text matching. Today we’re building our own tiny text search engine using that concept.

What We’re Building

  • Enter a search phrase.
  • The code finds and ranks your notes/lines/sentences by similarity.
  • The result? The most relevant lines pop up first, like magic (or math).

Step-by-Step: Tiny Search Engine in Python

Step 1: Gather Some Notes

notes = [
    "Hermione studied spells in the library",
    "Harry practiced flying on his broomstick",
    "Hagrid took care of the magical creatures",
    "Dumbledore gave Harry wise advice",
    "Ron played wizard chess in the common room"
]

Step 2: Grab the Needed Tools

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

Step 3: Ask for a Search Query

query = input("Search for: ")

Step 4: Stick the Query Onto Your Notes, and Vectorize

all_sentences = notes + [query]  # the query goes last, so it becomes the final row
vectorizer = CountVectorizer()
vectors = vectorizer.fit_transform(all_sentences)

Step 5: Compare Query to Every Note

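# Compare the query (last row) against every note (all other rows)
sim_scores = cosine_similarity(vectors[-1], vectors[:-1]).flatten()
best_idx = sim_scores.argmax()

# A top score of 0 means the query shares no words with any note
if sim_scores[best_idx] == 0: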
    print("No match found")
else:
    print("\nMost relevant note:")
    print(notes[best_idx])
    print(f"Similarity score: {sim_scores[best_idx]:.3f}")
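
The snippet above prints only the single best note, but the same scores can rank every note. Here is a minimal sketch of that idea. The helper name rank_notes and its top_k parameter are my own additions for illustration, not part of scikit-learn.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

notes = [
    "Hermione studied spells in the library",
    "Harry practiced flying on his broomstick",
    "Hagrid took care of the magical creatures",
    "Dumbledore gave Harry wise advice",
    "Ron played wizard chess in the common room",
]

def rank_notes(query, notes, top_k=3):
    """Hypothetical helper: return up to top_k (score, note) pairs, best first."""
    vectors = CountVectorizer().fit_transform(notes + [query])
    scores = cosine_similarity(vectors[-1], vectors[:-1]).flatten()
    order = np.argsort(scores)[::-1]  # note indices from highest to lowest score
    # Keep only notes that share at least one word with the query
    return [(float(scores[i]), notes[i]) for i in order[:top_k] if scores[i] > 0]

for score, note in rank_notes("harry", notes):
    print(f"{score:.3f}  {note}")
```

For "harry" this should list the Dumbledore note first (0.447) and the broomstick note second (0.408). An empty list means nothing matched.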

Some Examples

Search for: harry

Most relevant note:
Dumbledore gave Harry wise advice
Similarity score: 0.447

Search for: broomstick

Most relevant note:
Harry practiced flying on his broomstick
Similarity score: 0.408

Search for: ronaldo
No match found

We’re using bag of words to turn text into count lists, and cosine similarity to find the closest matches. It’s fast, totally offline, and beginner-friendly, my kind of Python magic!
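
To see where a score like 0.447 comes from, here is a rough pure-Python sketch of the same two ideas with no scikit-learn at all. The helper names bag_of_words and cosine are made up for this example, and the tokenizer only approximates CountVectorizer's default (lowercase, words of two or more characters).

```python
import math
import re
from collections import Counter

def bag_of_words(text):
    # Lowercase, then count every token of 2+ word characters
    return Counter(re.findall(r"\b\w\w+\b", text.lower()))

def cosine(a, b):
    # Dot product over shared words; a Counter returns 0 for missing words
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty bag matches nothing
    return dot / (norm_a * norm_b)

query = bag_of_words("harry")
note = bag_of_words("Dumbledore gave Harry wise advice")

# 1 shared word / (sqrt(1) * sqrt(5)) = 0.447
print(round(cosine(query, note), 3))
```

The note has five words and shares one with the query, so the score is 1 / √5 ≈ 0.447. That also explains why the six-word broomstick note scores a bit lower for "harry": 1 / √6 ≈ 0.408.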

Up Next

In the next post, we’ll learn about moving shapes with matrix math.