
I have a NumPy array representing a measurement curve. I am looking for the first index i such that the N elements starting at i satisfy some condition, e.g. lie within specific bounds. In pseudocode, I am looking for the minimal i such that

lower_bound < measurement[i:i+N] < higher_bound

is satisfied for all elements in the range.

Of course I could do the following:

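def first_index_in_bounds(measurement, lower_bound, higher_bound, N):
    # Check every window of N consecutive values; the +1 includes the last window.
    for i in range(len(measurement) - N + 1):
        test_vals = measurement[i:i + N]
        if all(lower_bound < x < higher_bound for x in test_vals):
            return i
    return None  # no window satisfies the condition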

This is extremely inefficient, as I compare N values for every i. What is the most Pythonic way to achieve this? Does NumPy have built-in functionality for this?

EDIT: As per request I provide some example input data

a = [1,2,3,4,5,5,6,7,8,5,4,5]
lower_bound = 3.5
upper_bound = 5.5 
N = 3

This should return 3, since starting at a[3] the elements stay within the bounds for at least 3 consecutive values.

stebu92

3 Answers


One NumPythonic vectorized solution is to build the indices of all sliding windows across the input array measurement as a 2D array, then index into measurement with those indices to get a 2D version of it with one window per row. Next, apply the bound checks and test every row in one go with np.all(..., axis=1). Finally, take the first successful row index as the output. The implementation would go something along these lines:

m2D = measurement[np.arange(N) + np.arange(len(measurement)-N+1)[:,None]]
np.nonzero(np.all((lower_bound < m2D) & (higher_bound > m2D),axis=1))[0][0]

Sample run -

In [1]: measurement = np.array([1,2,3,4,5,5,6,7,8,5,4,5])
   ...: lower_bound = 3.5
   ...: higher_bound = 5.5 
   ...: N = 3
   ...: 

In [2]: m2D = measurement[np.arange(N) + np.arange(len(measurement)-N+1)[:,None]]

In [3]: m2D  # Note that this is a 2D (shifted) version of the input
Out[3]: 
array([[1, 2, 3],
       [2, 3, 4],
       [3, 4, 5],
       [4, 5, 5],
       [5, 5, 6],
       [5, 6, 7],
       [6, 7, 8],
       [7, 8, 5],
       [8, 5, 4],
       [5, 4, 5]])

In [4]: np.nonzero(np.all((lower_bound < m2D) & (higher_bound > m2D),axis=1))[0][0]
Out[4]: 3
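
The same windowed check can also be sketched without building the index array, assuming NumPy 1.20 or later, where sliding_window_view is available. The function name first_window_in_bounds and the -1 return value for "no match" are choices made for this illustration, not part of the answer above. Note that the bound comparison still produces a temporary boolean array of shape (M-N+1, N), so the work remains proportional to M*N.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20


def first_window_in_bounds(measurement, lower_bound, higher_bound, N):
    """Return the first i with all of measurement[i:i+N] strictly inside the
    bounds, or -1 if no such window exists (hypothetical helper)."""
    measurement = np.asarray(measurement)
    if len(measurement) < N:
        return -1  # sliding_window_view would raise for an oversized window
    # Read-only view of shape (M - N + 1, N); no copy of the data is made here.
    windows = sliding_window_view(measurement, N)
    ok = np.all((lower_bound < windows) & (windows < higher_bound), axis=1)
    hits = np.flatnonzero(ok)
    return int(hits[0]) if hits.size else -1


print(first_window_in_bounds([1, 2, 3, 4, 5, 5, 6, 7, 8, 5, 4, 5], 3.5, 5.5, 3))
```

Unlike np.nonzero(...)[0][0], which raises an IndexError when nothing matches, this sketch reports the no-match case explicitly.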
Divakar

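If M is the length of a, here is an O(M) solution.

a = np.asarray(a)  # the comparisons below need an array, not a plain list
locations = (lower_bound < a) & (a < upper_bound)
cum = np.concatenate(([0], locations.cumsum()))  # leading 0 so a window starting at index 0 is found
lengths = cum[N:] - cum[:-N] == N  # lengths[i] is True when a[i:i+N] is entirely in bounds
result = lengths.nonzero()[0][0]  # raises IndexError if no window qualifies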
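
A different O(M) route, sketched here as an alternative rather than as the method above, is to locate the edges of each in-bounds run and pick the first run that is at least N long. The function name first_run_start and the -1 return value for "no match" are assumptions made for this illustration.

```python
import numpy as np


def first_run_start(values, lower_bound, upper_bound, n):
    """Return the first i with all of values[i:i+n] strictly inside the bounds,
    or -1 if there is no such index (hypothetical helper)."""
    values = np.asarray(values)
    in_bounds = (lower_bound < values) & (values < upper_bound)
    # Pad with False on both sides so every run has a detectable start and end.
    padded = np.concatenate(([False], in_bounds, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    # Edges alternate start, end; each run covers values[start:end].
    starts, ends = edges[::2], edges[1::2]
    long_enough = np.flatnonzero(ends - starts >= n)
    return int(starts[long_enough[0]]) if long_enough.size else -1


print(first_run_start([1, 2, 3, 4, 5, 5, 6, 7, 8, 5, 4, 5], 3.5, 5.5, 3))
```

For the example data the in-bounds runs are a[3:6] and a[9:12], so the first qualifying start is 3.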
B. M.

This answer could be helpful to you, although it is not specifically for numpy:

What is the best way to get the first item from an iterable matching a condition?
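
One common pure-Python idiom for "first item matching a condition" is next() over a generator expression with a default value. The sketch below applies it to the windowed problem from the question; the function name first_index_pure_python is hypothetical, and this is not necessarily what the linked answer proposes. It stops at the first hit but still checks up to N values per candidate index.

```python
def first_index_pure_python(values, lower_bound, upper_bound, n):
    """Return the first i with all of values[i:i+n] strictly inside the bounds,
    or None if there is no such index (hypothetical helper)."""
    candidates = (
        i
        for i in range(len(values) - n + 1)
        if all(lower_bound < x < upper_bound for x in values[i:i + n])
    )
    # next() returns the first candidate, or the default when the generator is empty.
    return next(candidates, None)


print(first_index_pure_python([1, 2, 3, 4, 5, 5, 6, 7, 8, 5, 4, 5], 3.5, 5.5, 3))
```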

Community