I have an array A and a list of slice indices (s, t); let's call this list L.

I want to find the 85th percentile of each slice A[s1:t1], A[s2:t2], ...

Is there a way to vectorize these operations in numpy?

import numpy

ans = []
for (s, t) in L:
    ans.append(numpy.percentile(A[s:t], 85))

looks cumbersome.

Thanks a lot!

PS: it's safe to assume s1 < s2 < ... and t1 < t2 < ..., so this is really just a sliding-window percentile problem.

CuriousMind
  • What is the shape of `A`? If it's (n,), would `t_k - s_k` be constant for all `k`? I.e. does your sliding window have a constant width? Thanks – eat Jul 29 '11 at 18:15
  • @eat: No, my sliding window does not have a constant width, b/c the sample rate is unfortunately not uniform. A is one-dimensional, though. – CuriousMind Jul 29 '11 at 18:58
  • @eat: I would also be interested in knowing whether there is a vectorized algorithm for a constant-width sliding window. – CuriousMind Jul 29 '11 at 19:12
  • Yes, there exist several ways to streamline the code if you have a constant width. And if your data really is non-uniformly sampled, you can always re-sample it to be uniform (by interpolation, although you still need to choose a proper interpolation method). Care to elaborate more on your specific case? Thanks – eat Jul 29 '11 at 19:37
  • @eat: I am sorry, I really can't interpolate the data. "Sample" is not a good word. I am dealing with market data. You know, if a trade happens here, I really can't assume it happens elsewhere. =) – CuriousMind Jul 29 '11 at 21:16
  • 'Market data'; well, I really do not know exactly what you mean. But FWIW, percentiles along non-uniform intervals don't seem to be very straightforward, either. Care to elaborate more on your specific problem? Thanks – eat Jul 29 '11 at 21:34

1 Answer


Given that you're dealing with a non-uniform interval (i.e. the slices aren't the same size), no, there's no way to have numpy do it in a single function call.

If the slices were all the same size, you could do so with various tricks, as @eat commented.
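For illustration only, here is a minimal sketch of one such trick for the constant-width case. It is not from the original thread: it assumes NumPy 1.20 or newer (for `sliding_window_view`), and the helper name `constant_width_percentile` and the sample data are made up for the example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20


def constant_width_percentile(a, width, q=85):
    """Return percentile q of every length-`width` window of the 1-D array `a`.

    Hypothetical helper for illustration; only valid when all windows
    have the same width.
    """
    # View of shape (len(a) - width + 1, width); no data is copied here.
    windows = sliding_window_view(a, width)
    # One vectorized call computes the percentile of each row (window).
    return np.percentile(windows, q, axis=-1)


# Made-up sample data.
A = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
result = constant_width_percentile(A, width=4)
```

Here `result[i]` corresponds to `numpy.percentile(A[i:i+4], 85)`. Note that `numpy.percentile` still has to work on a full copy of the windowed data, so memory use grows with `len(A) * width`.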

However, what's wrong with a list comprehension? It's exactly equivalent to your loop above, but it looks "cleaner" if that's what you're worried about.

ans = [numpy.percentile(A[s:t], 85) for s,t in L]
Joe Kington