I have a Numpy array, and I need to find the N maximum product subarrays of M elements. For example, I have the array p = [0.1, 0.2, 0.8, 0.5, 0.7, 0.9, 0.3, 0.5]
and I want to find the 5 highest product subarrays of 3 elements. Is there a "fast" way to do that?
Asked
Active
Viewed 135 times
3

Andrea Belle
- 95
- 1
- 4
-
That is not a numpy array – Mad Physicist May 20 '20 at 15:26
-
Did either of the posted solutions work for you? – Divakar Jun 06 '20 at 06:58
2 Answers
1
Approach #1
We can create sliding windows and then perform prod
reduction and finally np.argpartition
to get top N
ones among them -
from skimage.util.shape import view_as_windows
def topN_windowed_prod(a, W, N):
w = view_as_windows(a,W)
return w[w.prod(1).argpartition(-N)[-N:]]
Sample run -
In [2]: p = np.array([0.1, 0.2, 0.8, 0.5, 0.7, 0.9, 0.3, 0.5])
In [3]: topN_windowed_prod(p, W=3, N=2)
Out[3]:
array([[0.8, 0.5, 0.7],
[0.5, 0.7, 0.9]])
Note that the order is not maintained with np.argpartition
. So, if we need the top N
in descending order of prod
values, use range(N)
with it. More info.
Approach #2
For smaller window lengths, we can simply slice and get our desired result, like so -
def topN_windowed_prod_with_slicing(a, W, N):
w = view_as_windows(a,W)
L = len(a)-W+1
acc = a[:L].copy()
for i in range(1,W):
acc *= a[i:i+L]
idx = acc.argpartition(-N)[-N:]
return w[idx]

Divakar
- 218,885
- 19
- 262
- 358
1
Here is another quick way to do it:
import numpy as np
p = [0.1, 0.2, 0.8, 0.5, 0.7, 0.9, 0.3, 0.5]
n = 5
m = 3
# Cumulative product (starting with 1)
pc = np.cumprod(np.r_[1, p])
# Cumulative product of each window
w = pc[m:] / pc[:-m]
# Indices of the first element of top N windows
idx = np.argpartition(w, n)[-n:]
print(idx)
# [1 2 5 4 3]

jdehesa
- 58,456
- 7
- 77
- 121
-
1Idea is good. It's just that with decent sized arrays, and with fractions in it, it will end up with zeros at the backend. – Divakar May 20 '20 at 14:50
-
@Divakar Yes, that's a good point, if the array is big enough then precision is likely to suffer, and if it is not then performance is probably not a concern anyway. – jdehesa May 20 '20 at 15:02