When optimising slow parts of my code, I was surprised by the fact that A.sum() is almost twice as fast as A.max():
In [1]: A = arange(10*20*30*40).reshape(10, 20, 30, 40)
In [2]: %timeit A.max()
1000 loops, best of 3: 216 us per loop
In [3]: %timeit A.sum()
10000 loops, best of 3: 119 us per loop
In [4]: %timeit A.any()
1000 loops, best of 3: 217 us per loop
I had expected that A.any() would be much faster (it should need to check only one element!), followed by A.max(), and that A.sum() would be the slowest. My reasoning was that sum() needs to add numbers and update a value every time, while max() needs to compare numbers every time and update only sometimes, and I thought adding should be slower than comparing. In fact, it's the opposite. Why?
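
For anyone who wants to reproduce the measurement outside IPython, a minimal sketch along these lines should work. It assumes NumPy is imported as np (the session above presumably had arange already in the namespace). The helper name best_time_us and the repeat/number values are arbitrary choices for illustration, not what %timeit uses internally.

```python
import timeit

import numpy as np

# Same array as in the session above.
A = np.arange(10 * 20 * 30 * 40).reshape(10, 20, 30, 40)


def best_time_us(func, repeat=3, number=200):
    """Return the best per-call time in microseconds over `repeat` runs."""
    totals = timeit.repeat(func, repeat=repeat, number=number)
    return min(totals) / number * 1e6


# Time the three reductions in the same order as the IPython session.
results = {name: best_time_us(getattr(A, name)) for name in ("max", "sum", "any")}

for name, us in results.items():
    print(f"A.{name}(): {us:.1f} us per loop")
```

The absolute numbers will differ between machines and NumPy versions, so only the relative ordering is worth comparing with the timings above.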