Given the following three functions:

def v1(a):
    c = 0
    for a_ in a:
        if a_ is not None:
            c += 1
    return c

def v2(a):
    c = 0
    for a_ in a:
        if a_:
            c += 1
    return c

def v3(a):
    c = 0
    for a_ in a:
        if bool(a_):
            c += 1
    return c

I get the following performance (I'm using Python 3.6 on Ubuntu 18.04):

import random

values = [random.choice([1, None]) for _ in range(100000)]

%timeit v1(values)
3.35 ms ± 28 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit v2(values)
2.83 ms ± 36.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit v3(values)
12.3 ms ± 59.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

The similar performance between v1 and v2 makes sense, but why is v3 so much slower given that v2 is presumably implicitly calling bool(a_) too?

Is it simply calling bool() from Python rather than from C (as I assume `if` does) that's causing the difference in performance?

SColvin
  • Because built-ins... even `x` is faster than `len(x) != 0` or `x != []`. (in some question) – user202729 Apr 09 '18 at 14:44
  • (what about `c += bool(a)`?) – user202729 Apr 09 '18 at 14:45
  • I believe `bool()` will actually instantiate a bool class object and then do the check which would lead to a performance hit – quikst3r Apr 09 '18 at 14:45
  • [Here it is](https://stackoverflow.com/a/45778282/5267751). – user202729 Apr 09 '18 at 14:45
  • `bool()` is a class or type constructor, it is generic and not optimised for this task, related: https://stackoverflow.com/questions/49009870/what-are-the-differences-between-bool-and-operator-truth – Chris_Rands Apr 09 '18 at 14:52
  • Possible duplicate of [How do I check if a list is empty?](https://stackoverflow.com/questions/53513/how-do-i-check-if-a-list-is-empty) – mbrig Apr 09 '18 at 15:20
  • @mbrig That is not at all a duplicate just because an answer there touches upon it. – miradulo Apr 09 '18 at 16:44
  • @mbrig this is not at all the same question. Chris_Rands's linked question is much closer but still not a duplicate. – SColvin Apr 09 '18 at 17:58

2 Answers

This is mainly due to Python's dynamism and the fact that you have a Python-level call.

With bool, Python can't go directly and construct a new bool object. It has to do lookups to find what exactly is bound to the name bool; then it has to check whether that is something callable, parse its arguments, and then call it.

A construct such as if a_ has a defined meaning. It goes through a specific opcode (POP_JUMP_IF_FALSE here) and checks whether the loaded value is truthy. Far fewer hoops to jump through.

bool calls the same function to check whether the supplied value is truthy or falsy; it just has a longer trip until it gets there.
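
To illustrate the lookup step, here is a minimal sketch (not part of the original timings) showing that bool inside a function is only a name resolved at run time. The helper names count_with_bool and always_true are hypothetical and exist only for this demonstration.

```python
def count_with_bool(values):
    # `bool` is just a name here; it is resolved on every call,
    # first in the module globals and then in the builtins.
    c = 0
    for v in values:
        if bool(v):
            c += 1
    return c


def always_true(_value):
    # Hypothetical stand-in, used only to show that `bool` can be rebound.
    return True


values = [1, None, 1, None]

# The built-in bool is found, so only the two 1s are counted.
before = count_with_bool(values)

# Rebind the name in the namespace the function resolves globals in.
count_with_bool.__globals__["bool"] = always_true
during = count_with_bool(values)

# Remove the shadowing name so the built-in is found again.
del count_with_bool.__globals__["bool"]
after = count_with_bool(values)

print(before, during, after)
```

Because the name can be rebound like this at any time, the interpreter cannot assume that bool(a_) means a plain truth test, whereas if a_ always does.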

Dimitris Fasarakis Hilliard

v2 is able to evaluate the "truthiness" of a_ in the interpreter:

>>> dis.dis(v2)
...
11          14 LOAD_FAST                2 (a_)
            16 POP_JUMP_IF_FALSE       10
...

whereas v3 is required to actually call bool at the Python level:

>>> dis.dis(v3)
...
18          14 LOAD_GLOBAL              0 (bool)
            16 LOAD_FAST                2 (a_)
            18 CALL_FUNCTION            1
            20 POP_JUMP_IF_FALSE       10
...

The function call is what slows v3 down.
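
The same comparison can be made programmatically. The sketch below is an illustration rather than a definitive check: opcode names differ between CPython versions (CALL_FUNCTION on 3.6, CALL on newer releases), so it matches on the CALL prefix. The helpers has_call and loads_global_bool are hypothetical names introduced here.

```python
import dis


def v2(a):
    c = 0
    for a_ in a:
        if a_:
            c += 1
    return c


def v3(a):
    c = 0
    for a_ in a:
        if bool(a_):
            c += 1
    return c


def has_call(func):
    # True if the bytecode contains any call instruction.
    # Matching on the prefix covers CALL_FUNCTION (older) and CALL (newer).
    return any(ins.opname.startswith("CALL")
               for ins in dis.get_instructions(func))


def loads_global_bool(func):
    # True if the function looks up the global/builtin name `bool`.
    return any(ins.opname == "LOAD_GLOBAL" and ins.argval == "bool"
               for ins in dis.get_instructions(func))


for func in (v2, v3):
    print(func.__name__, has_call(func), loads_global_bool(func))
```

On the CPython versions I am aware of, v2 should report no call and no lookup of bool, while v3 reports both; the exact instruction names may change in future releases.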

chepner