
I'm new to Python, so I've decided to solve some common challenges to improve my knowledge of the language. I learned about numpy and its efficient ndarrays, so I attempted the following experiment:

Consider the two-sum problem (e.g. here) and let's solve it the naive way (the approach doesn't matter for the purposes of this question). Here's a solution using Python lists:

from itertools import combinations

def twosum1(n_lst):
    # check every pair of elements; keep the ones that sum to 7
    pairs = list(combinations(n_lst, 2))
    solutions = []
    for pair in pairs:
        if sum(pair) == 7:
            solutions.append(pair)
    return solutions

Then I created a version using np.arrays, expecting it to drastically speed up the calculation:

from itertools import combinations
import numpy as np

def twosum2(n_lst):
    # build all pairs as one ndarray, then select rows with a boolean mask
    pairs = np.array(list(combinations(n_lst, 2)), dtype=int)
    return pairs[pairs[:, 0] + pairs[:, 1] == 7]

However, after timing the two functions, twosum2 is about 2x slower than twosum1. So I thought the problem might be in the dynamic selection of elements, and I wrote an exact copy of twosum1, replacing lists with ndarrays ...

def twosum3(n_lst):
    pairs = np.array(list(combinations(n_lst, 2)))
    solutions = np.empty((0, 2))
    for pair in pairs:
        if np.sum(pair) == 7:
            # np.append copies the whole array on every call
            solutions = np.append(solutions, [pair], axis=0)
    return solutions

... and the resulting function was 10x slower than the original!

How is this possible? What am I doing wrong here? Clearly, removing loops and replacing lists with ndarrays is not enough to gain speed (contrary to what I learned reading this).

Edit:

  • I use %timeit in Jupyter to time the functions (the timing pattern is sketched after this list).
  • I run identical benchmarks for all the functions I'm timing.
  • The fact that I calculate the combinations in the same way in all 3 functions tells me that the slowdown is due to numpy ... but I don't see how.
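
Concretely, the timing pattern looks like this (n_lst = list(range(100)) is just an example input; I use the same input for every function):

n_lst = list(range(100))

%timeit twosum1(n_lst)
%timeit twosum2(n_lst)
%timeit twosum3(n_lst)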
  • How large is `n_lst`? There is some copying overhead in the NumPy solutions, when you create an array from a list. And could you also mention or show your timing methodology? – 9769953 Jan 15 '19 at 15:54
  • You are doing most of the work the same way in both functions: `list(combinations(n_lst,2))`. Adding a numpy wrapper after forcing the whole generator into memory is just clobbering your RAM for no good purpose. The actual comparison is not the bottleneck at all. – Mad Physicist Jan 15 '19 at 15:55
  • This is a really good example of why you sometimes have to be mindful to get a boost from using numpy. – Mad Physicist Jan 15 '19 at 15:57
  • @MadPhysicist I have added some edits to the post. – xi45 Jan 15 '19 at 16:00
  • You should expect normal loops to be slower with `numpy` arrays: an object is created and destroyed each iteration, or at least whenever you access an item. This is similar to the "boxing" that occurs in languages with primitive types. That being said, your tests include the cost of materializing the combinations into their respective data structures. Also, your "exact copy" uses `numpy.append`, which has linear time complexity, and using it in a loop is quadratic time. Don't do that; it is horribly inefficient. – juanpa.arrivillaga Jan 15 '19 at 16:00
  • `solutions=np.append(solutions,[pair],axis=0)` is probably more expensive than using a list. As far as I know, numpy does not amortize reallocations. – Mad Physicist Jan 15 '19 at 16:24
  • @MadPhysicist it definitely does not, `np.append` does not work in-place anyway. It always creates new arrays. – juanpa.arrivillaga Jan 15 '19 at 17:25
  • `np.append` is just a confusing front end to `np.concatenate`. It should be deprecated. Building an array by repeated `concatenate` is slow. It's better to build a list and do one array construction at the end (a timing sketch follows after these comments). – hpaulj Jan 15 '19 at 17:28
  • @hpaulj doesn't that defeat the purpose of working with np arrays in the first place? – xi45 Jan 15 '19 at 17:38
  • Once you have an array, using the compiled whole-array methods is fast. – hpaulj Jan 15 '19 at 19:09
  • @hpaulj Thanks for your input. Do you have a reference for "Building an array by repeated concatenate is slow. It's better to build a list and do one array construction at the end."? Esp. for large lists. – xi45 Jan 16 '19 at 10:20
  • Usually we offer timings as proof. And explanations about how list `append` is fast and in-place, while array `concatenate` has to make a new array each time. – hpaulj Jan 16 '19 at 16:47
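
As hpaulj suggests, timings make the point. Here is a minimal sketch (the function names and sizes are invented for illustration) comparing growth by repeated np.append against appending to a list and converting once at the end:

import timeit

import numpy as np

def grow_with_np_append(n):
    out = np.empty((0, 2))
    for k in range(n):
        # np.append copies the entire array on every call: O(n**2) overall
        out = np.append(out, [[k, k]], axis=0)
    return out

def grow_with_list(n):
    out = []
    for k in range(n):
        out.append((k, k))  # list append is amortized O(1)
    return np.array(out)    # one conversion at the end

# the list-based version should come out far ahead as n grows
print(timeit.timeit(lambda: grow_with_np_append(2000), number=3))
print(timeit.timeit(lambda: grow_with_list(2000), number=3))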

1 Answer


The costly operation is np.array(list(combinations(n_lst,2)), dtype=int), because Python must scan each member of the list, check whether it is 'int compatible', convert it to an integer, and store it in the array.
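
You can see this by timing the list-to-array conversion separately from the vectorised comparison (a quick sketch, reusing the imports from the question; the conversion should dominate here):

n_lst = list(range(100))
pairs_as_list = list(combinations(n_lst, 2))

# cost of the list -> ndarray conversion alone
%timeit np.array(pairs_as_list, dtype=int)

# cost of the vectorised comparison once the array already exists
pairs = np.array(pairs_as_list, dtype=int)
%timeit pairs[pairs[:, 0] + pairs[:, 1] == 7]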

To reach numpy performance, you must conceive the whole algorithm in numpy. For example:

In [63]: n_lst=list(range(100))

In [64]: %timeit twosum1(n_lst)
11.2 ms ± 1.64 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [65]: np.vstack(np.where(np.add.outer(n_lst,n_lst)==7)).T
Out[65]: 
array([[0, 7],
       [1, 6],
       [2, 5],
       [3, 4],
       [4, 3],
       [5, 2],
       [6, 1],
       [7, 0]], dtype=int64)

In [66]: %timeit np.vstack(np.where(np.add.outer(n_lst,n_lst)==7)).T
306 µs ± 19 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

This way you win a factor of 30 to 100, depending on the problem.
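
Note that np.where returns index pairs, and with n_lst = list(range(100)) the indices happen to equal the values. In general you would map the indices back to values and keep each pair only once; a minimal sketch (twosum_np is a name invented here) that reproduces twosum1's output:

import numpy as np

def twosum_np(n_lst, target=7):
    arr = np.asarray(n_lst)
    # n x n table of all pairwise sums, so memory is O(n**2)
    i, j = np.where(np.add.outer(arr, arr) == target)
    keep = i < j  # one ordering per pair, and no element paired with itself
    return np.column_stack((arr[i[keep]], arr[j[keep]]))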

B. M.
  • Thanks for your answer. It's odd that numpy is systematically slower than just using Python lists, though. It makes me wonder about more complex problems, since this is a trivial scenario. I also don't know what to think anymore about the text of jakevdp I linked to above. – xi45 Jan 15 '19 at 17:29
  • @xi45 So it's not that numpy is slower, it's that translating between numpy and python is slower. numpy's backend is written in C, which allows it to perform so quickly, but translating back and forth will cause most of the performance penalties. – Michael Choi Jan 15 '19 at 17:32
  • @xi45 If you stick to numpy arrays during your calculations, without "transforming" them from a numpy array to a python list back and forth, numpy will be several orders of magnitude faster, **especially for more complex problems**. – JE_Muc Jan 15 '19 at 17:35
  • @MichaelChoi Fine, but that's what we expect to happen by writing the code in python. I mean, I'm coming to python from C after reading about the 'great' numpy and scipy. What do you think about the text I linked to? – xi45 Jan 15 '19 at 17:35
  • @Scotty1- what makes you say that? I actually do only one transformation to np.array (the combinations); the rest of the code is looping over a list vs. using np.array masks. – xi45 Jan 15 '19 at 17:36
  • @xi45 numpy and scipy definitely are great. Numerical operations are easy to write, easy to understand and have great performance. Especially if you compare the performance of numpy to non-optimized C code... And as Mad Physicist pointed out, this *"only one transformation"* is exactly the problem. Lists do not have a homogeneous type, whereas numpy arrays do (except for object dtype). **This is the bottleneck.** If you stick to numpy functions, your code will be several orders of magnitude faster. – JE_Muc Jan 15 '19 at 17:42
  • When profiling the performance of operations, you need to consider several pitfalls: type conversions (one of the bottlenecks of python, so this problem is not numpy-specific), sample size, timing methods, etc... Just look at the answer of B. M. and notice how much faster his "pure numpy" method is compared to your python method. – JE_Muc Jan 15 '19 at 17:45
  • @Scotty1- Thank you for the clarifications. Using numpy functions alone does speed up the code; however, mixing numpy with other functionality results in a significant slowdown. In realistic, more complex cases, though ... I'm not sure I can do everything from and with numpy only. Any guidelines/refs to follow for consistent results? – xi45 Jan 15 '19 at 17:51
  • @xi45 You are welcome. Avoiding non-numpy functions is always a good idea when working with numpy. But most of the basic python operations like `+` or `+=` will be "translated" to the corresponding numpy functions automatically, so there is no need to worry about basic syntax. External modules that you have to import, and also built-ins like `itertools`, in many cases cause type conversions. Just take a look at the module description to see whether the specific module can handle numpy inputs, and check that the output is also a numpy array (a tiny illustration follows below). – JE_Muc Jan 15 '19 at 18:00
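
For instance, a tiny illustration of that last point (the variable names are arbitrary):

import numpy as np
from itertools import combinations

a = np.arange(5)
a += 1  # stays in numpy: dispatches to the np.add ufunc, no Python loop

pairs = list(combinations(a, 2))  # leaves numpy: builds Python tuples
print(type(pairs[0][0]))          # a boxed numpy integer scalar, no longer an array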