Optimizing Calculations with numpy and numba Python

Question

I am trying to make python run standard deviation functions faster with numba and numpy. However the problem is the for loop is very slow and I need alternatives so that I could make the code much faster. I iterated numba to the already existing numpy version however there is not much of a gain in performance. My original list_ has million of values within it thus it is taking a very long time to compute the standard deviation function. The list_ function down below is a very short numpy array that is meant to be an example for my problem as I wont be able to post the original list numbers. The for loop in the function below calculates the standard deviation of every nth number defined by the variable number in the list_ below. How would I be able to make this current function run faster.

import numpy as np
from numba import njit,jit,vectorize

number = 5
list_= np.array([457.334015,424.440002,394.795990,408.903992,398.821014,402.152008,435.790985,423.204987,411.574005,
404.424988,399.519989,377.181000,375.467010,386.944000,383.614990,375.071991,359.511993,328.865997,
320.510010,330.079010,336.187012,352.940002,365.026001,361.562012,362.299011,378.549011,390.414001,
400.869995,394.773010,382.556000])

Normal code:

def std_():
    std = np.array([list_[i:i+number].std() for i in range(0, len(list_)-number)])
    print(std)
std_()

Numba Code:

jitted_func = njit()(std_)
jitted_func()

performance results:

You can implement a simple for loop and calling inside `res[i]=np.std(a[i:i+window])`. If you compile this with Numba it is about as fast as numpy, without a memory overhead. A bit more advanced (and faster) would be to implement an algorithm for rolling standard deviation. eg. like described here https://jonisalonen.com/2014/efficient-and-accurate-rolling-standard-deviation/ — max9111, Jan 18 '21 at 14:18

Ananda · Answer 1 · 2021-01-17T06:04:33.877

You can do this in a vectorised fashion.

def rolling_window(a, window):
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

def std_():
    std = np.array([list_[i:i+number].std() for i in range(0, len(list_)-number)])
    return std

std1 = np.std(rolling_window(list_, 5), axis=1)
print(np.allclose(std1[:-1], std_()))

Gives True. The code for rolling_window has been taken from this answer.

Comparison with numba -

import numpy as np
from numba import njit,jit,vectorize

number = 5
list_= np.random.rand(10000)

def rolling_window(a, window):
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

def std_():
    std = np.array([list_[i:i+number].std() for i in range(0, len(list_)-number)])
    return std

%timeit np.std(rolling_window(list_, 5), axis=1)
%%timeit
jitted_func = njit()(std_)
jitted_func()

Gives

499 µs ± 3.98 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
106 ms ± 2.87 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Unfortunately it gives me a memory error `Unable to allocate 198. GiB for an array with shape (2659448, 10000) and data type float64`. As I have explained above that my main `list_` has over 2 million arrays in it. — tony selcuk, Jan 18 '21 at 00:48
Ananda, the OP has been stingy with information, but it appears that the `rolling_window` array is too large. The `as_strided` array is a `view`, a 'virtual' expansion of the source. But `np.std` has to subtract the `mean` from that, producing a full size array. `as_strided` arrays can easily blow up memory use. — hpaulj, Jan 18 '21 at 07:57

Optimizing Calculations with numpy and numba Python

1 Answers1

Linked