I'd like to compare each value x of an array with a rolling window of the n previous values. More precisely, I'd like to see at which percentile this new value x would fall if we added it to the previous window:
import numpy as np

A = np.array([1, 4, 9, 28, 28.5, 2, 283, 3.2, 7, 15])
print(A)
n = 4  # window width
for i in range(len(A) - n):
    W = A[i:i+n]         # the n values preceding x
    x = A[i+n]           # the new value to rank
    q = sum(W <= x) / n  # fraction of the window that is <= x
    print('Value:', x, ' Window before this value:', W, ' Quantile:', q)
[ 1. 4. 9. 28. 28.5 2. 283. 3.2 7. 15. ]
Value: 28.5 Window before this value: [ 1. 4. 9. 28.] Quantile: 1.0
Value: 2.0 Window before this value: [ 4. 9. 28. 28.5] Quantile: 0.0
Value: 283.0 Window before this value: [ 9. 28. 28.5 2. ] Quantile: 1.0
Value: 3.2 Window before this value: [ 28. 28.5 2. 283. ] Quantile: 0.25
Value: 7.0 Window before this value: [ 28.5 2. 283. 3.2] Quantile: 0.5
Value: 15.0 Window before this value: [ 2. 283. 3.2 7. ] Quantile: 0.75
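To pin down exactly what I mean, here is a vectorized sketch of the same computation. It assumes NumPy >= 1.20 (for `sliding_window_view`), and `rolling_rank` is just a name I made up for illustration. It still performs on the order of len(A) * n comparisons and materializes a boolean array of that size, so it is probably not the efficient solution I'm after for millions of items, but it defines the expected result:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20


def rolling_rank(a, n):
    """Fraction of the n previous values that are <= each new value.

    Hypothetical helper name; returns an array of length len(a) - n.
    """
    a = np.asarray(a, dtype=float)
    # All length-n windows as a read-only view; drop the last window,
    # which has no following value to compare against.
    windows = sliding_window_view(a, n)[:-1]
    new_values = a[n:]
    # Compare each new value against its own window and average the hits.
    return (windows <= new_values[:, None]).mean(axis=1)


A = np.array([1, 4, 9, 28, 28.5, 2, 283, 3.2, 7, 15])
print(rolling_rank(A, 4))
```

On the small array above this gives the same six quantiles as the loop (1.0, 0.0, 1.0, 0.25, 0.5, 0.75).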
Question: What is the name of this computation? Is there a clever numpy way to compute this more efficiently on arrays of millions of items (with n that can be ~5000)?
Note: here is a simulation for 1M items and n=5000, but it would take ~2 hours to run:
import numpy as np

# Not very interesting with a uniform [0, 1] random variable, but anyway...
A = np.random.random(1000 * 1000)
n = 5000
Q = np.zeros(len(A) - n)
for i in range(len(Q)):
    Q[i] = sum(A[i:i+n] <= A[i+n]) / n
    if i % 100 == 0:
        print("%.2f %% already done." % (i * 100.0 / len(Q)))
print(Q)
Note: this is not the same question as "How to compute moving (or rolling, if you will) percentile/quantile for a 1d array in numpy?"