NumPy array computation time questions

Question

Sorry if this question is a little too basic.

I am learning NumPy, and I learned from the question "Why are NumPy arrays so fast?" that NumPy arrays are fast for two reasons:

All elements of an np.array() share the same dtype, so the array benefits from locality of reference.
Vectorized operations are possible through things like uFuncs.

Therefore, it stands to reason that the computation speed should be:

(uFuncs with np arrays)>(loops with np arrays)>(loops with list).

However, the code below showed that the actual ordering was:

(uFuncs with np arrays)>(loops with list)>(loops with np arrays).

In other words, although using uFuncs with np arrays is the fastest, loops over lists were faster than loops over np arrays. Could anyone explain why this is so?

Thank you in advance for your answer!

import numpy as np
np.random.seed(0)

arr = np.arange(1, 1000)
arrlist = list(range(1, 1000))

#case0: list loop
def list_loop(values): #values: 1D array I want to take reciprocal of
  output=[]
  for i in range(len(values)):
    output.append(1.0/values[i])
  return output


#case1: np array loop
def nparray_loop(values): #values: 1D array I want to take reciprocal of
  output=np.empty(len(values))
  for i in range(len(values)):
    output[i]=1.0/values[i]
  return output

#case2: uFunc
def nparray_uFunc(values):
  return 1.0/values

print("list loop computation time:")
%timeit list_loop(arrlist)
print('\n')

print("np array loop computation time:")
%timeit nparray_loop(arr)
print('\n')

print("uFunc computation time:")
%timeit nparray_uFunc(arr)

Output:

list loop computation time:
185 µs ± 5.63 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)


np array loop computation time:
4.28 ms ± 402 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


uFunc computation time:
5.42 µs ± 1.23 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)

The short answer is that `numpy` methods iterate in compiled code. Python lists contain references, so access to the elements is simple, while access to `ndarray` elements requires a more complex 'unboxing'. We try to avoid python iteration on arrays. — hpaulj, Sep 18 '21 at 06:51
That linked question is poorly written (too vague), and the answers are worse. Too much emphasis on 'locality' and low level 'vectorization'. Well written numpy is fast compared to interpretive python, not fast compared to good `c` code. — hpaulj, Sep 18 '21 at 14:39
Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer. — Community, Sep 25 '21 at 09:01

score 1 · Accepted Answer · answered Sep 18 '21 at 05:57

A list is represented internally as a vector of references to Python objects. A NumPy array is a complex data structure in which the actual numeric data is stored in raw machine format.
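The difference in storage can be inspected directly. The following is only a small sketch that reuses the names `arr` and `arrlist` from the question; the exact dtype and sizes it prints depend on the platform and Python build.

```python
import sys
import numpy as np

arrlist = list(range(1, 1000))
arr = np.arange(1, 1000)

# The array keeps its values in one buffer of fixed-size raw items.
print(arr.dtype, arr.itemsize, arr.nbytes)
print(arr.flags["C_CONTIGUOUS"])

# The list keeps references; each element is a separate Python int object.
print(type(arrlist[0]), sys.getsizeof(arrlist[0]))
```

The array's `nbytes` is simply `itemsize` times the number of elements, whereas every list element carries the overhead of a full Python object.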

To call a Python function on a Python list, Python just plucks the appropriate element from the list and calls the function. To call a Python function on a NumPy array, Python must take each individual raw element of the array, convert it to the appropriate Python representation, and then call the function.
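The conversion step can be made visible with a short sketch, again assuming the `arr` and `arrlist` from the question. Indexing the list hands back the object that is already stored there, while indexing the array builds a fresh NumPy scalar object on each access (its exact integer type is platform dependent).

```python
import numpy as np

arrlist = list(range(1, 1000))
arr = np.arange(1, 1000)

# List indexing returns the stored Python int itself.
x = arrlist[0]
y = arrlist[0]
print(type(x), x is y)

# Array indexing wraps the raw value in a new NumPy scalar every time.
a = arr[0]
b = arr[0]
print(type(a), a is b)
```

This per-access object creation is the extra work paid on every iteration of `nparray_loop`.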

NumPy is fast for vector operations. If you plan to process the data element by element in a Python loop, it is less efficient than a plain list.
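For use outside IPython, the comparison can be sketched with the standard `timeit` module. The function names below are hypothetical, and the `tolist()` variant is an addition not taken from the question: it converts the array to plain Python ints once, which is one possible workaround when an explicit loop cannot be avoided. Timings will vary by machine, so none are quoted here.

```python
import timeit
import numpy as np

arr = np.arange(1, 1000)

def loop_over_array(values):
    # Element-by-element access: each values[i] creates a NumPy scalar.
    out = np.empty(len(values))
    for i in range(len(values)):
        out[i] = 1.0 / values[i]
    return out

def loop_over_tolist(values):
    # Convert once to plain Python ints, then loop over the resulting list.
    return [1.0 / v for v in values.tolist()]

def vectorized(values):
    # Whole-array operation; the loop runs in compiled code.
    return 1.0 / values

runs = 100
for fn in (loop_over_array, loop_over_tolist, vectorized):
    seconds = timeit.timeit(lambda: fn(arr), number=runs)
    print(f"{fn.__name__}: {seconds / runs * 1e6:.1f} µs per call")
```

All three functions compute the same reciprocals; only where the loop runs and how elements are accessed differ.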
