Python: function memoization

First a little question. What do you expect the following (Python 2) code to return?

def contrived(n, existing=[]):
    existing.append(n)
    return existing

print contrived(2)
print contrived(3)

Maybe try it and see? If the answer of “[2]”, followed by “[2, 3]” wasn’t what you had in mind then you need to understand a bit more about how Python handles default variables. Once you understand that initialisation of default variables is done once, at the time that the function is defined, things should make more sense.

The next question is when might this behaviour be useful? Well, how about function memoization? The idea is essentially keeping track of the results of expensive functions so that when the function is called again with the same arguments you don’t have to calculate it again. My favourite expensive function is the recursive implementation of the Fibonacci function.

def fib1(n):
    if n < 2:
        return 1

    return fib1(n - 1) + fib1(n - 2)

This is a really stupid way of calculating the n’th Fibonacci number, but serves nicely here. If I time how long it takes to calculate fib1(33) on my admittedly ancient desktop PC here, I find it takes 2.26 seconds to return the answer 5702887. If I make a second call in the same file, you won’t be surprised to find it takes the same time again. So, how about using a default variable to hold state between calls? Something like the following maybe.

def fib2(n, cache={}):
    if n in cache:
        return cache[n]

    value = fib2(n - 1) + fib2(n - 2)
    cache[n] = value
    return value

So now if I were to run fib2(33) two times in a row, you’d maybe expect it to take a little while to evaluate the first time and then come back almost instantly the second time? Well in this case it’s actually better than that because fib2 is recursive and the recursive calls themselves benefit from memoization. On my PC I get timings of 0.00 and 0.00 rounding to a hundredth of a second.

Well that’s nice I suppose, but what if the expensive function has already been written and you don’t want to have to make any changes to it? This is one of those cases where Python decorators are useful. Essentially you write a decorator function once that encapsulates the idea of memoization and then you can use it to transform any other function into a memoized function.

def memoize(fn, cache={}):
    def memoized(n):
        if n in cache:
            return cache[n]

        value = fn(n)
        cache[n] = value
        return value

    return memoized

At that point you could simply decorate the definition of fib1 with the memoize decorator, but just to prove the point that this can be done without any changes to fib1 at all, I’ll create another function that simply wraps fib1 and decorate that instead.

@memoize
def fib3(n):
    return fib1(n)

Voila! The naive, recursive definition of the Fibonacci function is transformed into a memoized function. More usefully, the memoize decorator can be used to do the same to any other function, including ones which might not be so easily optimised as this.

About Serif

An old skool Unix hacker, minus the beard and the sandals, who has currently found a nice warm place to spend his days at the offices of an ISP.
This entry was posted in programming. Bookmark the permalink.