all() vs for loop with break performance

Question

I have found an interesting performance optimization. Instead of using all():

def matches(self, item):
    return all(c.applies(item) for c in self.conditions)

I have profiled that it's faster when a loop is used instead:

def matches(self, item):
    for condition in self.conditions:
        if not condition.applies(item):
            return False
    return True

With all() the profiler shows 1160 additional <genexpr> calls:

         4608 function calls (4600 primitive calls) in 0.015 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      580    0.002    0.000    0.008    0.000 rule.py:23(matches)
     1160    0.002    0.000    0.005    0.000 rule.py:28(<genexpr>)

With the for loop, there are no <genexpr> calls:

         2867 function calls (2859 primitive calls) in 0.012 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      580    0.002    0.000    0.006    0.000 rule.py:23(matches)

My question is where does the difference come from? My first though was that all evaluates all conditions, but that's not the case:

def foo():
    print('foo')
    return False

all(foo() for _ in range(1000))
foo

I believe this was recently asked and answered. Trying to find the duplicate... — Charles Duffy, Mar 30 '18 at 14:32
[Why simple for loop with simple if condition is faster than conditional generator expression](https://stackoverflow.com/questions/46209173/why-simple-for-loop-with-if-condition-is-faster-than-conditional-generator-expre) is close, but not what I'm looking for; reopened, trying to find the closer one. — Charles Duffy, Mar 30 '18 at 14:36
...[`for` loop vs `all()` execution speed](https://stackoverflow.com/questions/49341966/for-loop-vs-all-execution-speed#comment85684397_49341966) is what I was looking for. Can't close the same question twice, though, so I'll need to leave it to another gold-badge in the tag to decide if it's exact or not. — Charles Duffy, Mar 30 '18 at 14:38
Still it's interesting why so many generator expressions are being created in the first place? — matino, Mar 30 '18 at 14:41
What you should have is a whole bunch of `next()` calls on a single genexp, every time `all` is ready to retrieve the next generated item. Since we aren't seeing `next()` shown in your count, presumably that's what those *are*. — Charles Duffy, Mar 30 '18 at 14:44
I actually get mixed results with a similar example... On Python 3.6.4, my results show the generator version is usually, but not always faster, as judged by `timeit`. I wonder if the profiler affects these results. — sytech, Mar 30 '18 at 14:49
Possible duplicate of [for loop vs. all() execution speed](https://stackoverflow.com/questions/49341966/for-loop-vs-all-execution-speed) — Fred Larson, Mar 30 '18 at 14:53
580 + 580 = 1160. My guess is one call for `next` another for `all` to check the boolean. Whereas the for loop uses a statement, rather than a function call. Edit: using `dis.dis` seems that that `all` and the genexp generate 2 additional `CALL_FUNCTION`s comapred to the for loop. — sytech, Mar 30 '18 at 14:54
The thing is that all conditons evaluate to false, So why creating all generators? Shouldnt the first one fail and cause all to exit? — matino, Mar 30 '18 at 15:02
@matino, ahh -- that last comment clarifies this question quite a lot; yes, `all()` should short-circuit. Could you generate a [mcve] we can run ourselves, rather than asking us to trust that that's the behavior your `self.conditions` list has? (That assumption being false would quite explain what's given here). — Charles Duffy, Mar 30 '18 at 15:06
If `all()` didn't short-circuit, then `all(itertools.chain([False], itertools.cycle([True])))` would be an infinite loop. It isn't; thus, we know that something in the implementation doesn't behave the way you expect. — Charles Duffy, Mar 30 '18 at 15:11
You are right, I was mislead because the original code contains a loop above of the matches method and it iterates 580 Times. Now its clear, sorry for confusion. — matino, Mar 30 '18 at 15:15
@matino Is what you posted the full and complete output of the profiler? I think you may be calling `matches` 580 times? Each time it stops on the first iteration perhaps? — sytech, Mar 30 '18 at 15:15
@sytech please see my last comment, i got confused and provided false information. Its clear now to me - there is a loop calling 580 times matches method, thats why So many generators are created. — matino, Mar 30 '18 at 15:29
More closely related to the question, in light of its eventual resolution: [Is the shortcircuit behavior of Python's `any` and `all` explicit?](https://stackoverflow.com/questions/14730046/is-the-shortcircuit-behaviour-of-pythons-any-all-explicit) — Charles Duffy, Mar 30 '18 at 15:29

sytech · Accepted Answer · 2018-03-30T15:19:56.337

To get the next sequence in a generator, next is called on the generator object, so each iteration will have this call. The for loop does not incur this cost, so may be slightly faster in this case. Here, the generator does not do much of anything for you in the way of performance.

My first though was that all evaluates all conditions

all will stop as soon as a falsy value is encountered. In contrast any stops on the first truthy value; that may help frame the meaning of all better.

Consider the following functions

def genfunc():
    return all(i for i in range(1, 100))

def forfunc():
    for i in range(1, 100):
        if not i:
            return False
    return True

If we use dis.dis to see what's going on...

dis.dis(genfunc)
      0 LOAD_GLOBAL              0 (all)
      2 LOAD_CONST               1 (<code object <genexpr> at 0x04D4E5A0, file "<ipython-input-2-60c0c9eff4e2>", line 2>)
      4 LOAD_CONST               2 ('genfunc.<locals>.<genexpr>')
      6 MAKE_FUNCTION            0
      8 LOAD_GLOBAL              1 (range)
     10 LOAD_CONST               3 (1)
     12 LOAD_CONST               4 (100)
     14 CALL_FUNCTION            2
     16 GET_ITER
     18 CALL_FUNCTION            1
     20 CALL_FUNCTION            1
     22 RETURN_VALUE

and in the for loop version..

dis.dis(forfunc)
      0 SETUP_LOOP              26 (to 28)
      2 LOAD_GLOBAL              0 (range)
      4 LOAD_CONST               1 (1)
      6 LOAD_CONST               2 (100)
      8 CALL_FUNCTION            2
     10 GET_ITER
>>   12 FOR_ITER                12 (to 26)
     14 STORE_FAST               0 (i)

     16 LOAD_FAST                0 (i)
     18 POP_JUMP_IF_TRUE        12

     20 LOAD_CONST               3 (False)
     22 RETURN_VALUE
     24 JUMP_ABSOLUTE           12
>>   26 POP_BLOCK

>>   28 LOAD_CONST               4 (True)
     30 RETURN_VALUE

You'll note that in the generator expression version there are 2 additional function calls (CALL_FUNCTION). This accounts for the additional 1160 calls (2 calls for each of the 580 loops) you see in the generator expression.version.

yes, but. it depends how you call all. For example, `all([a(), b(), c()]` will first calculate `a()`, `b()` and `c()` and then pass the list of truthiness to all(). This can be more expensive than a written-out for loop which breaks at first calculated false. — pbuck, Mar 30 '18 at 15:32
Yes, the lazy evaluation is one way you actually reap the performance benefits of generators. However, that's not how it's being used in this case. — sytech, Mar 30 '18 at 15:34

all() vs for loop with break performance

1 Answers1