
Is there a way to get rid of the loop in the code below and replace it with a vectorized operation?

Given a data matrix, for each row I want to find the index of the minimal value that fits within ranges defined (per row) in a separate array.

Here's an example:

import numpy as np
np.random.seed(10)

# Values of interest, for this example a random 6 x 100 matrix
data = np.random.random((6,100))

# For each row, define an inclusive min/max range
ranges = np.array([[0.3, 0.4],
                   [0.35, 0.5],
                   [0.45, 0.6],
                   [0.52, 0.65],
                   [0.6,  0.8],
                   [0.75,  0.92]])


# For each row, find the index of the minimum value that fits inside the given range
result = np.zeros(6, dtype=int)
for i in range(6):
    ind = np.where((ranges[i][0] <= data[i]) & (data[i] <= ranges[i][1]))[0]
    result[i] = ind[np.argmin(data[i,ind])]

print(result)
# Result: [35  8 22  8 34 78]

print(data[np.arange(6),result])
# Result: [ 0.30070006  0.35065639  0.45784951  0.52885388  0.61393513  0.75449247]
Divakar
Fnord

2 Answers


Approach #1 : Using broadcasting and np.minimum.reduceat -

mask = (ranges[:,None,0] <= data) & (data <= ranges[:,None,1])
r,c = np.nonzero(mask)
cut_idx = np.unique(r, return_index=True)[1]
out = np.minimum.reduceat(data[mask], cut_idx)

Improvement to avoid np.nonzero and compute cut_idx directly from mask:

cut_idx = np.concatenate(( [0], np.count_nonzero(mask[:-1],1).cumsum() ))
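Putting Approach #1 together on the question's sample data (a sketch; it assumes every row has at least one in-range value, and it recovers the minimum values, i.e. the question's second printout, rather than the indices):

```python
import numpy as np

np.random.seed(10)
data = np.random.random((6, 100))
ranges = np.array([[0.3, 0.4], [0.35, 0.5], [0.45, 0.6],
                   [0.52, 0.65], [0.6, 0.8], [0.75, 0.92]])

# Row-wise mask of values inside each row's inclusive range
mask = (ranges[:, None, 0] <= data) & (data <= ranges[:, None, 1])

# Start offset of each row's True entries within the flattened data[mask]
cut_idx = np.concatenate(([0], np.count_nonzero(mask[:-1], 1).cumsum()))

# Segment-wise minimum: one value per row
out = np.minimum.reduceat(data[mask], cut_idx)
print(out)
```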

Approach #2 : Using broadcasting and filling invalid places with NaNs and then using np.nanargmin -

mask = (ranges[:,None,0] <= data) & (data <= ranges[:,None,1])
result = np.nanargmin(np.where(mask, data, np.nan), axis=1)
out = data[np.arange(6),result]
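Note that np.nanargmin raises a ValueError if some row has no in-range value at all; a guarded variant (a sketch, using -1 as an arbitrary no-match marker) could look like:

```python
import numpy as np

np.random.seed(10)
data = np.random.random((6, 100))
ranges = np.array([[0.3, 0.4], [0.35, 0.5], [0.45, 0.6],
                   [0.52, 0.65], [0.6, 0.8], [0.75, 0.92]])

# Mask of in-range values; out-of-range entries become NaN
mask = (ranges[:, None, 0] <= data) & (data <= ranges[:, None, 1])
masked = np.where(mask, data, np.nan)

valid = mask.any(axis=1)                 # rows with at least one in-range value
result = np.full(data.shape[0], -1)      # -1 marks rows with no match
result[valid] = np.nanargmin(masked[valid], axis=1)
print(result)
```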

Approach #3 : If there aren't many rows to iterate over (like the 6 in the sample), you might want to stick to a loop for memory efficiency, but use the more efficient boolean-array masking instead of np.where -

out = np.zeros(6)
for i in range(6):
    mask_i = (ranges[i,0] <= data[i]) & (data[i] <= ranges[i,1])
    out[i] = np.min(data[i,mask_i])

Approach #4 : There is one more loopy solution possible here. The idea would be to sort each row of data, then use the two range limits for each row to decide on the start and stop indices with help from np.searchsorted, and finally slice with those indices to get the minimum values. The benefit of slicing that way is that we would be working with views, which is very efficient on both memory and performance.

The implementation would look something like this -

out = np.zeros(6)
sdata = np.sort(data, axis=1)
for i in range(6):
    start = np.searchsorted(sdata[i], ranges[i,0])
    stop = np.searchsorted(sdata[i], ranges[i,1], 'right')
    out[i] = np.min(sdata[i,start:stop])

Furthermore, we could get those start, stop indices in a vectorized manner following an implementation of vectorized searchsorted.
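One way to vectorize the searchsorted step is to shift each sorted row into its own disjoint band and run a single flat searchsorted (a sketch; it assumes all values lie in [0, 1), which holds for np.random.random, and that each row has at least one in-range value):

```python
import numpy as np

np.random.seed(10)
data = np.random.random((6, 100))
ranges = np.array([[0.3, 0.4], [0.35, 0.5], [0.45, 0.6],
                   [0.52, 0.65], [0.6, 0.8], [0.75, 0.92]])

sdata = np.sort(data, axis=1)
n, m = sdata.shape

# Shift row i by i so the flattened array stays globally sorted
shift = np.arange(n)
flat = (sdata + shift[:, None]).ravel()

# Search all shifted limits at once, then map back to row-local indices
start = np.searchsorted(flat, ranges[:, 0] + shift) - shift * m
stop = np.searchsorted(flat, ranges[:, 1] + shift, 'right') - shift * m

# Per @Daniel F's observation, sdata[i, start] is already the minimum
out = sdata[np.arange(n), start]
print(out)
```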

Based on suggestion by @Daniel F for the case when we are dealing with ranges that are within the limits of given data, we could simply use the start indices -

out[i] = sdata[i, start]
Divakar
  • For #4: Isn't `sdata[i, np.searchsorted(sdata[i], ranges[i,0])]` already the minimum unless it's outside the bounds? Just `where` test that value and return `nan` if it's out of bounds. – Daniel F Sep 05 '17 at 10:01
  • @DanielF Not sure I got you. Why would `sdata[i, np.searchsorted(sdata[i], ranges[i,0])]` be minimum? In the sorted array `sdata` we are looking for the first index where `ranges[i,0]` would be to the left of it. That `stop` needs an edit though : `np.searchsorted(sdata[i], ranges[i,1], 'right') ` to cover for the first index to the right of such a sorted data. – Divakar Sep 05 '17 at 10:18
  • Because if if `sdata` is sorted, `min(sdata[start:stop])` will always be `sdata[start]`. Actually what you have now will throw an error if all values are below the lower bound, since `start` and `stop` will be `sdata.shape[1]` – Daniel F Sep 05 '17 at 10:23
  • @DanielF Ah yes! I got it. Edited post to cover that. Thanks! – Divakar Sep 05 '17 at 10:39
  • @DanielF If all values from `data` are below the range start, the original code won't work either and the expected o/p for such a case isn't discussed in the post. So, at this point I am assuming not to worry about such a case. – Divakar Sep 05 '17 at 10:41

Assuming each row has at least one value in range, you don't even have to bother with the upper limit:

result = np.empty(6, dtype=int)
for i in range(6):
    lt = (ranges[i,0] >= data[i]).sum()
    result[i] = np.argpartition(data[i], lt)[lt]

Actually, you could even vectorize the whole thing using argpartition:

lt = (ranges[:,None,0] >= data).sum(1)
result = np.argpartition(data, lt)[np.arange(data.shape[0]), lt]

Of course, this is only efficient if data.shape[0] << data.shape[1]; otherwise you're basically sorting the whole array.
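For reference, np.argpartition accepts a sequence for kth, which is what makes the one-shot version work: every row is partitioned at all the values in lt, so position lt[i] of row i holds its lt[i]-th smallest element, i.e. the smallest value not below that row's lower bound. A quick check against the question's loop (a sketch under the same at-least-one-in-range assumption):

```python
import numpy as np

np.random.seed(10)
data = np.random.random((6, 100))
ranges = np.array([[0.3, 0.4], [0.35, 0.5], [0.45, 0.6],
                   [0.52, 0.65], [0.6, 0.8], [0.75, 0.92]])

# Per-row count of values at or below the lower bound
lt = (ranges[:, None, 0] >= data).sum(1)

# kth as a sequence: each row is partitioned at every value in lt,
# so [i, lt[i]] is row i's lt[i]-th smallest element
result = np.argpartition(data, lt)[np.arange(data.shape[0]), lt]
print(data[np.arange(6), result])
```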

Daniel F
  • i haven't had a chance to look for the problem but when running your first example i get an index out of bounds error at this line: `result[i] = np.argpartition(data, lt)[lt]`... second example worked. – Fnord Sep 05 '17 at 17:34
  • Yeah, fixed that. – Daniel F Sep 05 '17 at 19:13