
Consider this performance test in IPython under Python 3:

Create a range, a range_iterator and a generator:

In [1]: g1 = range(1000000)

In [2]: g2 = iter(range(1000000))

In [3]: g3 = (i for i in range(1000000))

Measure the time for summing using Python's native sum:

In [4]: %timeit sum(g1)
10 loops, best of 3: 47.4 ms per loop

In [5]: %timeit sum(g2)
The slowest run took 374430.34 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 123 ns per loop

In [6]: %timeit sum(g3)
The slowest run took 1302907.54 times longer than the fastest. This could mean that an intermediate result is being cached.
10000000 loops, best of 3: 128 ns per loop

I'm not sure whether I should worry about the warning. The range version's timing is very long (why?), but the range_iterator and the generator are similar.

Now let's use numpy.sum

In [7]: import numpy as np

In [8]: %timeit np.sum(g1)
10 loops, best of 3: 174 ms per loop

In [9]: %timeit np.sum(g2)
The slowest run took 8.47 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 6.51 µs per loop

In [10]: %timeit np.sum(g3)
The slowest run took 9.59 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 446 ns per loop

g1 and g3 became ~3.5x slower, but the range_iterator g2 is now some ~50x slower compared to the native sum. g3 wins.

In [11]: type(g1)
Out[11]: range

In [12]: type(g2)
Out[12]: range_iterator

In [13]: type(g3)
Out[13]: generator

Why such a penalty for range_iterator on numpy.sum? Should such objects be avoided? Does this generalize - do "home made" generators always beat other objects in numpy?

EDIT 1: I realized that np.sum does not evaluate the range_iterator but returns another range_iterator object, so this comparison is not valid. Why doesn't it get evaluated?

EDIT 2: I also realized that numpy.sum converts the range to fixed-width integers and accordingly gives the wrong result due to integer overflow.

In [12]: sum(range(1000000))
Out[12]: 499999500000

In [13]: np.sum(range(1000000))
Out[13]: 1783293664

In [14]: np.sum(range(1000000), dtype=float)
Out[14]: 499999500000.0
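
The overflow happens because np.array converts the range to the platform's default fixed-width integer dtype (32-bit on some platforms), while Python's own int is arbitrary precision. A quick check of the default dtype on your machine (an added illustration, not from the original session):

import numpy as np

# The default integer dtype is platform dependent; if it is int32,
# sums over a range this large overflow silently.
print(np.array(range(10)).dtype)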

Intermediate conclusion - don't use numpy.sum on non-numpy objects...?

Aguy
  • This is reiterating a previous similar question that was marked off-topic. Hopefully you'll find this one on-topic. – Aguy Jul 09 '16 at 06:04
  • Possible duplicate of [Why are Python's arrays slow?](http://stackoverflow.com/questions/36778568/why-are-pythons-arrays-slow) – styvane Jul 09 '16 at 07:08
  • @SSDMS - that answer is for `import array`, not `import numpy`. – hpaulj Jul 09 '16 at 07:12
  • Since you are using Ipython, use `np.sum??` to look at its code. Except for the special case of a `generator` (not iterator), `np.sum` operates on the `np.array` version of the input. So if `np.sum` behavior puzzles you, look at `np.array(arg)`. – hpaulj Jul 09 '16 at 19:16

1 Answer


Did you look at the results of repeated sums on the iterator?

95:~/mypy$ g2=iter(range(10))
96:~/mypy$ sum(g2)
Out[96]: 45
97:~/mypy$ sum(g2)
Out[97]: 0
98:~/mypy$ sum(g2)
Out[98]: 0

Why the 0s? Because g2 can be used only once; the same goes for the generator expression (demonstrated below).

Or look at it with list

100:~/mypy$ g2=iter(range(10))
101:~/mypy$ list(g2)
Out[101]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
102:~/mypy$ list(g2)
Out[102]: []
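
For completeness, a quick check (added here, not in the original session) shows that a generator expression is exhausted in exactly the same way:

g3 = (i for i in range(10))
print(sum(g3))  # 45
print(sum(g3))  # 0 - the generator is exhausted after the first pass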

In Python 3, range returns a range object, not a list. A range object is not itself an iterator: it creates a fresh iterator each time it is iterated, which is why it can be summed repeatedly.
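
A short sketch (my addition) makes the distinction concrete:

r = range(10)
print(sum(r))              # 45
print(sum(r))              # 45 - a fresh iterator is created each time
print(iter(r) is iter(r))  # False - two distinct iterators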

As for np.sum, np.sum(range(10)) has to make an array first.
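
In other words, something roughly like this happens under the hood (a sketch of the conversion step, not NumPy's literal code path):

import numpy as np

r = range(10)
a = np.array(r)   # the range is materialized into an ndarray first...
print(np.sum(a))  # ...and only then summed: 45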

When operating on a list, the Python sum is quite fast, faster than np.sum on the same:

116:~/mypy$ %%timeit x=list(range(10000))
       ...: sum(x)
1000 loops, best of 3: 202 µs per loop

117:~/mypy$ %%timeit x=list(range(10000))
       ...: np.sum(x)
1000 loops, best of 3: 1.62 ms per loop

But operating on an array, np.sum does much better: the builtin sum has to box each array element back into a Python integer, while np.sum runs a single compiled loop.

118:~/mypy$ %%timeit x=np.arange(10000)
       ...: sum(x)
100 loops, best of 3: 5.92 ms per loop

119:~/mypy$ %%timeit x=np.arange(10000)
       ...: np.sum(x)
<caching warning>
100000 loops, best of 3: 18.6 µs per loop
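
Outside IPython, the same comparison can be reproduced with the plain timeit module (a sketch under my own loop counts, not the original measurements):

import timeit

setup = "import numpy as np; x = np.arange(10000)"
# Python's sum boxes every element into a Python int.
print(timeit.timeit("sum(x)", setup=setup, number=100))
# np.sum runs one compiled loop over the buffer.
print(timeit.timeit("np.sum(x)", setup=setup, number=100))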

Another timing - various ways of making an array. fromiter can be faster than np.array, but the builtin arange is much better.

124:~/mypy$ timeit np.array(range(100000))
10 loops, best of 3: 39.2 ms per loop
125:~/mypy$ timeit np.fromiter(range(100000),int)
100 loops, best of 3: 12.9 ms per loop
126:~/mypy$ timeit np.arange(100000)
The slowest run took 6.93 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 106 µs per loop

Use range if you intend to work with lists; use numpy's own arange if you need to work with arrays. There is an overhead in creating arrays, so they are more valuable when working with large ones.
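
As a sanity check (my addition), all three constructions yield the same values; they differ only in how much Python-level iteration is involved:

import numpy as np

a = np.array(range(100))          # treats the range as a sequence, converts element by element
b = np.fromiter(range(100), int)  # fills the array directly from the iterable
c = np.arange(100)                # generated entirely in compiled code
assert (a == b).all() and (b == c).all()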

==================

On the question of how np.sum handles an iterator - it doesn't. Look at what np.array does to such an object:

In [12]: np.array(iter(range(10)))
Out[12]: array(<range_iterator object at 0xb5998f98>, dtype=object)

It produces a single-element array with dtype object - the iterator is wrapped, not evaluated.
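
This also explains EDIT 1 in the question: "summing" a single-element object array just hands back the wrapped object. An illustration (added here, behavior as I understand it):

import numpy as np

it = iter(range(10))
a = np.array(it)        # the iterator is wrapped, not consumed
print(type(np.sum(a)))  # a range_iterator, not a number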

fromiter will evaluate this iterable:

In [13]: np.fromiter(iter(range(10)),int)
Out[13]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
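
fromiter also takes an optional count argument, which lets it preallocate the result instead of growing it (a minor optimization the original timings don't cover):

import numpy as np

# Telling fromiter the length up front avoids repeated reallocation.
print(np.fromiter(iter(range(10)), int, count=10))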

np.array follows some complicated rules when converting its input to an array. It's designed to work primarily with a list of numbers or nested lists of equal length.

If you have questions about how a np function handles a non-array object, first check what np.array does to that object.
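
A compact way to apply that advice (a sketch I've added, not from the original answer) is to survey np.array's handling of each object type side by side:

import numpy as np

for obj in (range(5), iter(range(5)), (i for i in range(5)), [0, 1, 2, 3, 4]):
    a = np.array(obj)
    # ranges and lists convert to numeric arrays; iterators and
    # generators get wrapped into 0-d object arrays
    print(type(obj).__name__, '->', a.dtype, a.shape)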

hpaulj