Find range of length N in a numpy array where some condition is satisfied

Question

I have a NumPy array representing a measurement curve. I am looking for the first index i such that the N elements starting at i all satisfy some condition, e.g. lie within specific bounds. In pseudocode, I am looking for the minimal i such that

lower_bound < measurement[i:i+N] < higher_bound

is satisfied for all elements in the range.

Of course I could do the following:

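def first_index(measurement, lower_bound, higher_bound, N):  # wrapper name is illustrative
    # + 1 so that the last full window is tested as well
    for i in range(len(measurement) - N + 1):
        test_vals = measurement[i:i + N]
        if all(lower_bound < x < higher_bound for x in test_vals):
            return i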

This is extremely inefficient, as I compare N values for every i. What is the most Pythonic way to achieve this? Does NumPy have built-in functionality for this?

EDIT: As per request I provide some example input data

a = [1,2,3,4,5,5,6,7,8,5,4,5]
lower_bound = 3.5
upper_bound = 5.5 
N = 3

This should return 3, since starting at a[3] the elements stay within the bounds for at least 3 values.

As a first optimization, when x is not within the bounds, you can start the next test_vals at measurement[i+index(x)+1] — VirgileD, Nov 13 '15 at 10:11
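The skip-ahead idea from the comment above could look roughly like the sketch below. The function name `first_window_skip` and the `None` return value for "no match" are assumptions made for illustration, not part of the question. It also jumps past the last out-of-bounds element of the window rather than the first one, which is a slightly stronger skip than the comment describes but rests on the same reasoning.

```python
import numpy as np

def first_window_skip(measurement, lower_bound, higher_bound, N):
    """Return the first i with measurement[i:i+N] strictly inside the bounds, else None.

    Hypothetical helper; assumes N >= 1.
    """
    measurement = np.asarray(measurement)
    i = 0
    while i + N <= len(measurement):
        window = measurement[i:i + N]
        # Positions (relative to i) of elements that violate the bounds
        bad = np.flatnonzero((window <= lower_bound) | (window >= higher_bound))
        if bad.size == 0:
            return int(i)
        # Any window containing an offending element fails too, so restart just after the last one
        i += int(bad[-1]) + 1
    return None
```

With the example data from the question, `first_window_skip(a, 3.5, 5.5, 3)` starts at window `[1, 2, 3]`, jumps straight to index 3, and returns there.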

Divakar · Accepted Answer · 2015-11-13T10:24:43.367

One NumPythonic vectorized solution is to build the indices of all sliding windows of length N as a 2D array, then index into measurement with those indices to get a 2D, windowed version of measurement. Next, apply the bound checks and find the rows in which every element passes in one go with np.all(..., axis=1). Finally, take the first such row index as the output. The implementation would go something along these lines:

m2D = measurement[np.arange(N) + np.arange(len(measurement)-N+1)[:,None]]
np.nonzero(np.all((lower_bound < m2D) & (higher_bound > m2D),axis=1))[0][0]

Sample run -

In [1]: measurement = np.array([1,2,3,4,5,5,6,7,8,5,4,5])
   ...: lower_bound = 3.5
   ...: higher_bound = 5.5 
   ...: N = 3
   ...: 

In [2]: m2D = measurement[np.arange(N) + np.arange(len(measurement)-N+1)[:,None]]

In [3]: m2D # Note that this is a 2D (shifted) version of the input
Out[3]: 
array([[1, 2, 3],
       [2, 3, 4],
       [3, 4, 5],
       [4, 5, 5],
       [5, 5, 6],
       [5, 6, 7],
       [6, 7, 8],
       [7, 8, 5],
       [8, 5, 4],
       [5, 4, 5]])

In [4]: np.nonzero(np.all((lower_bound < m2D) & (higher_bound > m2D),axis=1))[0][0]
Out[4]: 3
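On newer NumPy versions (1.20 and later), the same windowed layout can be obtained without building an index array, using `numpy.lib.stride_tricks.sliding_window_view`. The following is a minimal sketch rather than a drop-in replacement for the answer: the function name `first_window_view` and the `None` return for "no window found" are assumptions, and the bound check is applied once to the 1D array before windowing.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def first_window_view(measurement, lower_bound, higher_bound, N):
    """Return the first window start whose N elements are all within bounds, else None.

    Hypothetical helper; assumes N >= 1 and NumPy >= 1.20.
    """
    measurement = np.asarray(measurement)
    if N > measurement.size:
        return None  # sliding_window_view would raise for an oversized window
    # Check the bounds once per element, then window the boolean mask
    in_bounds = (lower_bound < measurement) & (measurement < higher_bound)
    windows = sliding_window_view(in_bounds, N)  # shape (len - N + 1, N), a read-only view
    hits = np.flatnonzero(windows.all(axis=1))
    return int(hits[0]) if hits.size else None
```

Because the windows are a strided view, no (len - N + 1) x N copy of the data is created, although `all(axis=1)` still touches N values per window.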

Thanks, this is what I was looking for! — stebu92, Nov 13 '15 at 12:35

B. M. · Answer 2 · 2015-11-13T13:55:42.243


If M is the length of a, here is an O(M) solution.

locations=(lower_bound<a) & (a<upper_bound)
cum=locations.cumsum()
lengths=np.roll(cum,-N)-cum==N
result=lengths.nonzero()[0][0]+1
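As far as I can tell, the `np.roll` version above can only report windows that start at index 1 or later, and it raises an `IndexError` when no window qualifies. A variant of the same cumulative-sum idea that prepends a zero to the running count covers both cases. This is a sketch under assumptions: the function name `first_window_cumsum` and the `None` return value are illustrative and not taken from the answer.

```python
import numpy as np

def first_window_cumsum(a, lower_bound, upper_bound, N):
    """Return the first i with all of a[i:i+N] strictly inside the bounds, else None.

    Hypothetical helper; assumes N >= 1. Runs in O(M) for an array of length M.
    """
    a = np.asarray(a)
    locations = (lower_bound < a) & (a < upper_bound)
    # cum[k] = number of in-bounds elements among a[:k]; the leading 0 makes cum[0] valid
    cum = np.concatenate(([0], np.cumsum(locations)))
    # counts[i] = number of in-bounds elements in a[i:i+N]
    counts = cum[N:] - cum[:-N]
    hits = np.flatnonzero(counts == N)
    return int(hits[0]) if hits.size else None
```

For the example input this returns 3, and for an array such as `[4, 5, 5, 1]` with N = 3 it returns 0.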

edited Nov 13 '15 at 13:55

answered Nov 13 '15 at 13:47


score 0 · Answer 3 · edited May 23 '17 at 12:15


This answer could be helpful to you, although it is not specifically for numpy:

What is the best way to get the first item from an iterable matching a condition?


answered Nov 13 '15 at 11:04

Florent Chatterji
