Given a multidimensional array, I want to compute a rolling percentile over one of its axes, with the rolling windows truncated near the boundaries of the array. Below is a minimal example implementation using only numpy, via `np.nanpercentile()` applied to stacked, rolled (through `np.roll()`) arrays. However, the input array may be very large (~ 1 GB or more), so two issues arise:
- For the current implementation, the stacked, rolled array may not fit into RAM. This could be avoided with `for`-loops over all axes unaffected by the rolling, but that may be slow.
- Even fully vectorized (as below), the computation time is quite long, understandably due to the sheer number of computations performed.
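To make the `for`-loop idea concrete, here is a sketch of a loop over the window positions along the rolled axis instead (the function name and signature are mine, not from an existing library). It never materializes the (2*n_window+1)-times-stacked array, so peak memory stays close to the input size, at the cost of a Python-level loop:

```python
import numpy as np

def rolling_percentile_loop(a, n_window, p, axis=0):
    """Rolling (p*100)-th percentile along `axis`, with windows
    truncated at the array boundaries (same semantics as padding
    with NaNs and using np.nanpercentile)."""
    a = np.moveaxis(a, axis, 0)
    n = a.shape[0]
    out = np.empty(a.shape, dtype=np.float64)
    for i in range(n):
        # truncated window: clip the window bounds at the array edges
        lo = max(0, i - n_window)
        hi = min(n, i + n_window + 1)
        out[i] = np.percentile(a[lo:hi], p * 100, axis=0)
    return np.moveaxis(out, 0, axis)
```

A side benefit: since no NaNs are introduced, the faster `np.percentile()` can be used instead of `np.nanpercentile()`, which by itself saves time.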
Questions: Is there a more efficient Python implementation of a rolling percentile (with an `axis`/`axes` argument or the like, and with truncated windows near the boundaries)? If not, how could the computation be sped up (and, if possible, without exceeding RAM)? C code called from Python? Computing the percentiles only at fewer "central" points and approximating in between via (e.g. linear) interpolation? Other ideas?
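To illustrate the interpolation idea from the last question: one could evaluate the exact truncated-window percentile only at every `step`-th position along the rolled axis and linearly interpolate in between. This is only a sketch (all names are mine) under the assumption that the percentile varies slowly along that axis; it trades accuracy for roughly a factor `step` fewer percentile evaluations:

```python
import numpy as np

def rolling_percentile_coarse(a, n_window, p, step):
    """Approximate rolling percentile along axis 0: exact truncated-window
    percentiles at every `step`-th index, linear interpolation in between."""
    n = a.shape[0]
    # sample indices; make sure the last index is included
    idx = np.unique(np.r_[np.arange(0, n, step), n - 1])
    coarse = np.stack([
        np.percentile(a[max(0, i - n_window):i + n_window + 1], p * 100, axis=0)
        for i in idx
    ])
    # fractional position of every full-resolution index on the coarse grid
    pos = np.interp(np.arange(n), idx, np.arange(len(idx)))
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, len(idx) - 1)
    frac = (pos - lo).reshape(-1, *([1] * (a.ndim - 1)))
    return coarse[lo] * (1 - frac) + coarse[hi] * frac
```

At the sampled indices the result equals the exact rolling percentile; in between it is a linear approximation.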
Related post (implementing rolling percentiles): How to compute moving (or rolling, if you will) percentile/quantile for a 1d array in numpy? Issues are:
- the pandas implementation via `pd.Series().rolling().quantile()` works only for `pd.Series` or `pd.DataFrame` objects, not for multidimensional (4D or arbitrary-D) arrays;
- an implementation via `np.lib.stride_tricks.as_strided()` with `np.nanpercentile()` is similar to the one below and should not be much faster, given that `np.nanpercentile()` is the speed bottleneck, see below.
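For reference, in the 1-D case the pandas approach from the linked post does produce truncated windows when `center=True` and `min_periods=1` are set (note that pandas expects the quantile as a fraction, not a percentage). A small sketch:

```python
import numpy as np
import pandas as pd

x = np.random.rand(20)
n_window = 3  # window spans indices i - n_window .. i + n_window

# min_periods=1 truncates the windows at the series boundaries,
# matching the NaN-padding approach used in the question
s = pd.Series(x).rolling(window=2 * n_window + 1,
                         center=True, min_periods=1).quantile(0.7)
```

This matches a truncated-window `np.percentile()` evaluated per position, but only for 1-D data; it does not directly help for the 4-D array here.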
Minimal example implementation:
import numpy as np
np.random.seed(100)
# random array of numbers
a = np.random.rand(10000,1,70,70)
# size of rolling window
n_window = 150
# percentile to compute (as a fraction; converted to the 0-100 scale below)
p = 0.7
# NaN values to prepend/append to array before rolling
nan_temp = np.full(tuple([n_window] + list(np.array(a.shape)[1:])), fill_value=np.nan)
# prepend and append NaN values to array
a_temp = np.concatenate((nan_temp, a, nan_temp), axis=0)
# roll array, stack rolled copies along a new last axis,
# compute percentile (ignoring NaNs) using np.nanpercentile()
res = np.nanpercentile(
    np.concatenate(
        [np.roll(a_temp, shift=i, axis=0)[..., None]
         for i in range(-n_window, n_window + 1)],
        axis=-1),
    p * 100, axis=-1)
# cut away the prepended/appended NaN values
res = res[n_window:-n_window]
Computation times (in seconds), for the case of `a` having a shape of (1000,1,70,70) instead of (10000,1,70,70):
create random array: 0.0688176155090332
prepend/append NaN values: 0.03478217124938965
stack rolled arrays: 38.17830514907837
compute nanpercentile: 1145.1418626308441
cut out result: 0.0004646778106689453