How array size affects numpy matrix operation execution time and CPU usage

Question

My question is about the following code:

%%time
import numpy as np
n_elems = 95
n_repeats = 100000
for i in range(n_repeats):
    X = np.random.rand(n_elems, n_elems)
    y = np.random.rand(n_elems)
    _ = X.dot(y)

I run this in iPython (version 6.2.1) with Python 3.5.5 and numpy version 1.14.0 on an 8-core machine.

I get the following output:

CPU times: user 8.93 s, sys: 439 ms, total: 9.37 s  
Wall time: 8.79 s

When n_elems is set between 1 and 95, the CPU and wall time are roughly equivalent. In addition, the CPU usage of the process (as seen using top) only goes up to 100%.

However, when n_elems is set to 96, I get the following:

CPU times: user 39.4 s, sys: 1min 28s, total: 2min 8s  
Wall time: 16.2 s

There is now a noticeable difference between the CPU and wall time. Also, the CPU usage reaches close to 800%.
Similar behaviour is observed for larger values of n_elems.

I think this is because at a certain array size the numpy operation becomes multithreaded.
Could someone clarify this?
Also is there a way to restrict CPU usage of the process to 100%.

8-cores here as well, but I can't see a significant difference of the timings between your cases. And CPU usage is 800% in both cases. — JE_Muc, May 23 '18 at 16:56
For my use case, I switched to using `np.einsum` (instead of `np.dot`) since this is single-threaded. There is a slight increase in execution time but it no longer uses multiple processors. — dtailor, May 25 '18 at 07:08

score 0 · Answer 1 · answered May 28 '18 at 11:37

It seems that some numpy operations (e.g. numpy.dot) use BLAS which can execute in parallel.

Other numpy operations (e.g. numpy.einsum) are implemented directly in C and execute serially.

See why isn't numpy.mean multithreaded? for more details.

To restrict execution of numpy.dot to a single thread irregardless of array size, I had to set the environment variable OMP_NUM_THREADS to 1 before importing numpy.

I found the following helpful:
How do you stop numpy from multithreading?
Limit number of threads in numpy

How array size affects numpy matrix operation execution time and CPU usage

1 Answers1