functools.lru_cache: A Powerful Tool for Efficient Caching

Posted by Afsal on 08-Sep-2023

Hi Pythonistas!

Today we will learn about an easy but very effective tool for caching. Caching is a common optimization technique in computer science and software development: it improves the performance of programs by storing and reusing previously computed results. In Python, the functools.lru_cache decorator provides a convenient way to add caching to a function with a single line of code.

functools.lru_cache is a Python decorator that implements a "Least Recently Used" (LRU) cache. It is a built-in tool, available in the Python standard library since Python 3.2. Applied to a function, it caches return values keyed by the arguments, so subsequent calls with the same arguments return the cached result instead of re-computing it.

Under the hood, lru_cache maintains a fixed-size cache of function calls and their results. When you decorate a function with @lru_cache, Python automatically creates a cache object to store the function's return values. The cache holds the most recently used results, and when it reaches its maximum size, it evicts the least recently used values to make room for new entries.
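As a small illustration of this eviction behavior (not from the original example), here is a sketch using a tiny maxsize of 2, together with the cache_info() method that lru_cache attaches to the decorated function:

```python
import functools

@functools.lru_cache(maxsize=2)
def square(n):
    # Stand-in for an expensive computation
    return n * n

square(1)  # miss: computed and cached
square(2)  # miss: computed and cached (cache now full)
square(1)  # hit: returned from the cache
square(3)  # miss: 2 is the least recently used entry, so it is evicted
square(2)  # miss: 2 was evicted, so it is computed again

# CacheInfo(hits=1, misses=4, maxsize=2, currsize=2)
print(square.cache_info())
```

Notice that calling square(1) before square(3) "refreshed" entry 1, so it was entry 2 that got evicted.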

Let us check this with an example. 

import functools 

@functools.lru_cache(maxsize=None)
def factorial(n):
    if n < 1:
        return 1
    return n * factorial(n-1)

Here we decorate the function with functools.lru_cache. The maxsize argument sets the size of the cache; if maxsize=None, all results are cached and nothing is ever evicted. Now we can compare the performance with and without lru_cache.

import functools
import timeit

def factorial(n):
    if n < 1:
        return 1
    return n * factorial(n-1)

@functools.lru_cache(maxsize=None)
def factorial_with_cache(n):
    if n < 1:
        return 1
    return n * factorial_with_cache(n-1)

def call_factorial():
    factorial(100)

def call_factorial_with_cache():
    factorial_with_cache(100)

timetaken = timeit.timeit(call_factorial, number=1_000_000)
print("Time taken without cache", timetaken)

timetaken = timeit.timeit(call_factorial_with_cache, number=1_000_000)
print("Time taken with cache", timetaken)

Output

Time taken without cache 7.262950392998391

Time taken with cache 0.07707665700036159

We can see a huge difference between the two, nearly a 100x speedup here, and all it took was adding a single decorator line.

Use cases

  • You have a computationally expensive function that you need to call multiple times with the same arguments.
  • The function's return values are deterministic, meaning they depend only on the function's arguments.
  • The function's arguments are hashable, since lru_cache uses them as cache keys.
  • You want to reduce the running time of your code by avoiding redundant calculations.
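One more practical note: a decorated function also gains cache_info() and cache_clear() methods, which are handy for inspecting hit rates and for resetting the cache, for instance between benchmark runs. A brief sketch using the factorial example from above:

```python
import functools

@functools.lru_cache(maxsize=None)
def factorial(n):
    if n < 1:
        return 1
    return n * factorial(n - 1)

factorial(10)
# One miss per recursive call (n = 10 down to 0), so misses=11, currsize=11
print(factorial.cache_info())

factorial.cache_clear()  # drop all cached entries
# Back to hits=0, misses=0, currsize=0
print(factorial.cache_info())
```

Clearing the cache is important when benchmarking with timeit, otherwise every run after the first is just a cache hit.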

I hope you have learned something from this post. Please share your valuable suggestions with afsal@parseltongue.co.in