The Runcible Blog

Is exception handling slow?

Ned's post about "human sorting" lists lead to the inevitable comments from people who want to write the most succinct solution to the problem. Among them was the remark that "exception handling is slow." Ned asked me why it was assumed that exception handling was so slow, and I replied that it isn't (based on some of the remarks about improvements in exception handling in Python 2.5). I had no real evidence of the overhead from catching exceptions, so I thought I'd benchmark the different sorting methods

First, Ned's:

def tryint(s):
    try:
        return int(s)
    except:
        return s

def ak1(s):
    return [ tryint(c) for c in re.split('([0-9]+)', s) ]

Then, using str.isdigit instead:

def safeint(s):
    if s.isdigit():
        return int(s)
    else:
        return s

def ak2(s):
    return [ safeint(c) for c in re.split('([0-9]+)', s) ]

Somebody mentioned the new syntax in 2.5 that allows one-liner conditionals, thus saving a function call:

def ak3(s):
    return [ int(c) if c.isdigit() else c for c in re.split('([0-9]+)', s) ]

Of course, there's the easy optimization of compiling the regular expression beforehand:

NUM_RE = re.compile('([0-9]+)')
def ak4(s):
    return [ int(c) if c.isdigit() else c for c in NUM_RE.split(s) ]

Ok, so what were the results? Here, I'm sorting a 5000 item list of strings created like so:

def make_list(sise): 
    samp = string.letters + string.digits
    return [''.join(random.sample(samp, 4)) for i in xrange(size)]

for each function:
timeit.py -s "import sorting; td=sorting.make_list(5000)" "td.sort(key=sorting.ak1)"
Results:
ak1: 80.7 msec
ak2: 37 msec
ak3: 34 msec
ak4: 18.1 msec

It turns out the conventional wisdom isn't just an old pythonic wive's tale: exception handling is slow. Even though ak1 would've been my first thought for the function (hanging around with Ned too much?) because it seems more pythonic to just try to convert the value to an int and catch the exception if it's not an integer, that function is 4 times slower than the fastest case. Also, this benchmark shows that there isn't much of a gain from the new conditional syntax, even though a function call is saved.

As usual, pre-compiling the regex makes lots of sense for performance-critical code. For giggles, I have a function (that I suppose works...) which avoids the regular expression:

def ak5(s):
    new_s = []
    b = ''
    for c in s:
        oc = ord(c)
        if oc > 57 or oc < 48:
            if b:
                new_s.append(int(b))
                b = ''
            new_s.append(c)
        else:
            b += c
    return new_s

It's not as slow as I thought, averaging 25.5 msecs on my machine.

What have we learned today?

  • Python exception handling isn't nearly as lightweight as I thought. Especially avoid it in tight loops.
  • Function calls should be minimized. (duh)
  • Sometimes the more succinct solution isn't faster.

  • Compile common regular expressions. (duh)

Comments

By: Michael Johnson on December 18, 2007 at 4:38 p.m.

Exception handling is (relatively) slow in any language that implements them. This is because when you throw an exception the target code has to unwind the call stack while still preserving the exception. Unwinding the call stack is a complex problem, thus slow.

That's why the common wisdom is to use exceptions for the exceptional conditions they're designed for. In those cases, the the performance penalty of walking through the call stack is an acceptable trade off with the convenience of throwing an exception.

By: Masklinn on December 13, 2007 at 6:04 a.m.

Note: the problem here is that the most common case is having the exception triggered. That's not exactly the most pythonic thing to do, we usually want exceptions when they aren't the most common case (because entering the `try` is pretty much free, exception handling is only costly when exceptions are raised). And regular boolean test when they are (or try to change the case where exceptions are triggered)