
I have a piece of code that starts from a large numpy array and uses it to compute a second array. Because the array is very large, could you please let me know whether there is an efficient way to achieve my goal? (I think the efficient way would be to operate on the array directly rather than through a for-loop.)

Thanks in advance; please find my code below:

import numpy as np

N = 1000000000
rand = np.random.rand(N)
beta = np.zeros(N)
for i in range(N):
    if rand[i] < 0.5:
        beta[i] = 2.0*rand[i]
    else:
        beta[i] = 1.0/(2.0*(1.0-rand[i]))
Kevin Sun
  • You can try scipy's sparse matrix. Check out this [post](https://stackoverflow.com/questions/2540059/scipy-sparse-arrays). – Pinyi Wang Oct 26 '18 at 21:14
  • For such a large matrix, this might still be too slow, but probably better: `np.where(rand < 0.5, 2.0*rand, 1.0/(2.0*(1.0-rand)))` – sacuL Oct 26 '18 at 21:16
  • @EricWang, I don't see the relevance of sparse matrices to this problem: `rand` doesn't have a high proportion of zeros, and neither does `beta`. – hpaulj Oct 27 '18 at 00:22
  • A faster method with nearly no changes to your code (I wouldn't initialize with zeros an array that is completely overwritten, and I would add `@nb.njit(error_model="numpy")`) would be https://stackoverflow.com/q/52046301/4045774. – max9111 Oct 30 '18 at 17:34
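
A minimal sketch of the Numba approach from the comment above, assuming numba is installed (the function name compute_beta is illustrative, not from the linked post):

import numba as nb
import numpy as np

@nb.njit(error_model="numpy")
def compute_beta(rand):
    beta = np.empty_like(rand)  # no zero-fill needed: every slot is overwritten below
    for i in range(rand.shape[0]):
        if rand[i] < 0.5:
            beta[i] = 2.0 * rand[i]
        else:
            beta[i] = 1.0 / (2.0 * (1.0 - rand[i]))
    return beta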

3 Answers


You are basically losing the efficiency of numpy here by performing the processing in Python. The idea of numpy is to process the items in bulk, since it has efficient algorithms implemented in C behind the curtain that do the actual processing. You can see the Python end of numpy more as an "interface".

Now to answer your question: we can first construct an array of random numbers between 0 and 2, by multiplying by 2 up front:

rand = 2.0 * np.random.rand(N)

Next we can use np.where(..) [numpy-doc], which acts like a conditional selector: we pass it three "arrays": the first is an array of booleans that encodes the truthiness of the "condition"; the second is the array of values to fill in where the condition is true; and the third is the array of values to fill in where it is false. Note that since rand is now 2r, the else-branch 1.0/(2.0*(1.0-r)) simplifies to 1.0/(2.0-rand). We can then write:

N = 1000000000
rand = 2 * np.random.rand(N)
beta = np.where(rand < 1.0, rand, 1.0 / (2.0 - rand))
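
A quick sanity check (illustrative) that the transformed formula matches the original per-element definition:

import numpy as np

r = np.random.rand(1000)
expected = np.where(r < 0.5, 2.0 * r, 1.0 / (2.0 * (1.0 - r)))  # original formula
rand2 = 2.0 * r
assert np.allclose(expected, np.where(rand2 < 1.0, rand2, 1.0 / (2.0 - rand2)))
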
Willem Van Onsem
  • Thanks very much. Based on the answers, there are two ways: 1) using numpy's vectorize after defining a function, and 2) using where. I tested on my computer with N = 1×10^8: the first method takes 30 s, while the second takes 1.3 s (1.9 s if I don't multiply by 2 first). Thanks so so much. – Kevin Sun Oct 26 '18 at 21:47
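
For reference, the np.vectorize variant mentioned in the comment above might look like the sketch below; np.vectorize is essentially a convenience wrapper around a Python-level loop, which is consistent with the ~30 s timing:

import numpy as np

def f(r):
    # scalar version of the piecewise formula
    return 2.0 * r if r < 0.5 else 1.0 / (2.0 * (1.0 - r))

beta = np.vectorize(f)(np.random.rand(10**6))  # convenient, but still loops in Python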

N = 1000000000 caused a MemoryError for me, so I'm reducing it to 100 for a minimal example. You can use the np.where routine.

In both cases, fundamentally you are iterating over your array and applying a function. However, np.where uses a much faster loop (it's compiled code, basically), while your "Python" loop is interpreted and thus really slow for a big N.

Here's an example implementation:

import numpy as np

N = 100
rand = np.random.rand(N)
beta = np.where(rand < 0.5, 2.0 * rand, 1.0 / (2.0 * (1.0 - rand)))
madjaoue

As other answers have pointed out, iterating over the elements of a numpy array in a Python loop should (and can) almost always be avoided. In most cases going from a Python loop to an array operation gives a speedup of ~100x.

However, if performance is absolutely critical, you can often squeeze out another factor of between 2x and 10x (in my experience) by using Cython. Here's an example:

%%cython
cimport numpy as np
import numpy as np
cimport cython
from cython cimport floating

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.cdivision(True)
cpdef np.ndarray[floating, ndim=1] beta(np.ndarray[floating, ndim=1] arr):
    cdef:
        Py_ssize_t i
        Py_ssize_t N = arr.shape[0]
        np.ndarray[floating, ndim=1] result = np.zeros(N, dtype=arr.dtype)  # match input dtype so the fused type checks out for float32 too

    for i in range(N):
        if arr[i] < 0.5:
            result[i] = 2.0*arr[i]
        else:
            result[i] = 1.0/(2.0*(1.0-arr[i]))

    return result

You would then call it as beta(rand). As you can see, this allows you to use your original loop structure, but now using efficient typed native code. I get a speedup of ~2.5x compared to np.where.
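
A usage sketch, assuming the %%cython cell above has been executed in a Jupyter session (the timing comparison is illustrative):

import numpy as np

rand = np.random.rand(10**7)
result = beta(rand)  # compiled typed loop from the Cython cell above
# Compare against the pure-numpy version, e.g. with %timeit in Jupyter:
# %timeit beta(rand)
# %timeit np.where(rand < 0.5, 2.0*rand, 1.0/(2.0*(1.0-rand)))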

It should be noted that in many cases this is not worth the extra effort compared to the numpy one-liner -- but it may well be worth it where performance is critical.

jhansen