Why evaluating boolean objects takes time in python

Question

I compared these two code snippets using the timeit module and realized the second one is slightly faster:

~$ python -m timeit —setup "l=[1, 2];k=1" "l[k==1]"
10000000 loops, best of 3: 0.0414 usec per loop
~$ python -m timeit —setup "l=[1, 2];k=1" "l[0 if k==1 else 1]"
10000000 loops, best of 3: 0.0372 usec per loop

Since the logic is the same I thought evaluating boolean objects takes more time than integer equivalence (True == 1 and False == 0), therefore I came up with the following benchmark and it turns out that I was correct:

~$ python -m timeit —setup "l=range(1000)" "l[False]"
10000000 loops, best of 3: 0.0411 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[False]"
10000000 loops, best of 3: 0.0394 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[False]"
10000000 loops, best of 3: 0.0416 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[True]"
10000000 loops, best of 3: 0.0428 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[True]"
10000000 loops, best of 3: 0.0394 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[True]"
10000000 loops, best of 3: 0.0393 usec per loop
~$ 
~$
~$ python -m timeit —setup "l=range(1000)" "l[0]"
10000000 loops, best of 3: 0.0232 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[0]"
10000000 loops, best of 3: 0.0232 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[0]"
10000000 loops, best of 3: 0.0232 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[1]"
10000000 loops, best of 3: 0.0232 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[1]"
10000000 loops, best of 3: 0.0232 usec per loop
~$ python -m timeit —setup "l=range(1000)" "l[1]"
10000000 loops, best of 3: 0.0232 usec per loop

But I don't know what the underlying reason is. I mean why evaluating True and False takes more time? Also I noticed another mysterious thing during the benchmark. In the first part of the benchmark there is variation in the results while the numbers for the second one is stable.

The difference between `boolean` and `integer` equivalence [was already asked here](http://stackoverflow.com/questions/31459623/why-does-my-sieve-of-eratosthenes-work-faster-with-integers-than-with-booleans). — Nander Speerstra, Feb 10 '16 at 16:26
Yes, thanks. But that thread doesn't discuss the instability of the numbers with regards to True/False evaluation. — Omid, Feb 10 '16 at 16:32
Instability is easy: cache misses. Since you need to do a lookup there's no guarantee that you hit it in cache or you need to go to ram to get it or that is remains in cache while the benchmark is running. So instability. — Sorin, Feb 10 '16 at 16:45

user2357112 · Accepted Answer · 2016-02-10T17:08:28.743

For l[k==1] and l[0 if k==1 else 1], you didn't time it long enough. The difference you saw is within what you'd get from random variation. I'm not sure which form is ultimately faster, but a longer trial showed the opposite effect:

>>> timeit.timeit('l[k==1]', 'l=[1,2];k=1', number=100000000)
10.782931089401245
>>> timeit.timeit('l[0 if k==1 else 1]', 'l=[1,2];k=1', number=100000000)
11.140317916870117

l[0 if k==1 else 1] is unexpectedly competitive most likely because l[k==1] doesn't hit the fast path for the BINARY_SUBSCR opcode:

TARGET_NOARG(BINARY_SUBSCR)
{
    w = POP();
    v = TOP();
    if (PyList_CheckExact(v) && PyInt_CheckExact(w)) {
        /* INLINE: list[int] */
        Py_ssize_t i = PyInt_AsSsize_t(w);
        if (i < 0)
            i += PyList_GET_SIZE(v);
        if (i >= 0 && i < PyList_GET_SIZE(v)) {
            x = PyList_GET_ITEM(v, i);
            Py_INCREF(x);
        }
        else
            goto slow_get;
    }
    else
      slow_get:
        x = PyObject_GetItem(v, w);

In your second test, there's the additional factor that in Python 2, True is a built-in variable lookup, while 1 is a much faster LOAD_CONST. LOAD_CONST only indexes into the code object's co_consts tuple, while a built-in lookup takes two dict lookups.

score 0 · Answer 2 · edited May 23 '17 at 11:59

The difference between boolean and integer division have been asked earlier. However, the (in)stability of it isn't discussed. Below, my scores:

Python2

~$ python2 -m timeit --setup "l=range(1000)" "l[False]"
10000000 loops, best of 3: 0.0366 usec per loop
~$ python2 -m timeit --setup "l=range(1000)" "l[False]"
10000000 loops, best of 3: 0.0332 usec per loop
~$ python2 -m timeit --setup "l=range(1000)" "l[1]"
10000000 loops, best of 3: 0.0193 usec per loop
~$ python2 -m timeit --setup "l=range(1000)" "l[1]"
100000000 loops, best of 3: 0.0194 usec per loop
~$ python2 -m timeit --setup "l=range(1000)" "l[1]"
100000000 loops, best of 3: 0.0195 usec per loop
~$ python2 -m timeit --setup "l=range(1000)" "l[0]"
100000000 loops, best of 3: 0.0196 usec per loop

Python 3

~$ python -m timeit --setup "l=range(1000)" "l[0]"
10000000 loops, best of 3: 0.0712 usec per loop
~$ python -m timeit --setup "l=range(1000)" "l[0]"
10000000 loops, best of 3: 0.072 usec per loop
~$ python -m timeit --setup "l=range(1000)" "l[0]"
10000000 loops, best of 3: 0.0719 usec per loop
~$ python -m timeit --setup "l=range(1000)" "l[False]"
10000000 loops, best of 3: 0.082 usec per loop
~$ python -m timeit --setup "l=range(1000)" "l[False]"
10000000 loops, best of 3: 0.0821 usec per loop

Funny thing is: my score alter not only between Python versions, but also within Python versions. Due to cache misses, differences are logical. Funny thing at your scores is that the differences at the 0 and 1 are so small, you cannot see it with 4 decimals... (I'm using a virtual machine, so this may slow my system enough to make it easy to see the difference)

Why evaluating boolean objects takes time in python

2 Answers2