
In numpy, I would like to detect the points at which a signal, having previously been below one threshold, rises above another (higher) threshold. This is useful for things like debouncing, or accurate zero-crossing detection in the presence of noise.

Like this:

import numpy

# set up little test problem
N = 1000
values = numpy.sin(numpy.linspace(0, 20, N))
values += 0.4 * numpy.random.random(N) - 0.2
v_high = 0.3
v_low = -0.3

# find transitions from below v_low to above v_high    
transitions = numpy.zeros_like(values, dtype=bool)

state = "high"

for i in range(N):
    if values[i] > v_high:
        # previous state was low, this is a low-to-high transition
        if state == "low":
            transitions[i] = True
        state = "high"
    if values[i] < v_low:
        state = "low"

I would like a way to do this without looping over the array explicitly, but I can't think of one, since each state value depends on the previous state. Is it possible to do this without a loop?

Alex I

3 Answers


This can be done like so:

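import numpy as np

def hyst(x, th_lo, th_hi, initial = False):
    hi = x >= th_hi  # samples at or above the upper threshold
    lo_or_hi = (x <= th_lo) | hi  # samples outside the band between the thresholds
    ind = np.nonzero(lo_or_hi)[0]  # indices of those samples
    if not ind.size: # prevent index error if ind is empty
        return np.zeros_like(x, dtype=bool) | initial
    cnt = np.cumsum(lo_or_hi) # from 0 to len(ind)
    # cnt-1 selects the most recent outside sample; where cnt is 0, use initial
    return np.where(cnt, hi[ind[cnt-1]], initial)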

Explanation: ind holds the indices of all the samples where the signal is at or below the lower threshold or at or above the upper threshold, and for which the position of the 'switch' is thus well-defined. With cumsum, you build a counter such that cnt-1 points to the entry in ind of the last well-defined sample. If the start of the input vector is between the two thresholds, cnt will be 0 there, so you need to set the corresponding output to the initial value using the where function.
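
To make the indexing concrete, here is a small trace of the same steps on a hand-picked array. The sample values and the variable name last_outside are assumptions made for illustration only; the thresholds are -0.5 and 0.5 as in the test below.

```python
import numpy as np

# Illustrative signal: samples 1 and 6 are high, sample 3 is low,
# everything else lies between the thresholds.
x = np.array([0.0, 0.8, 0.2, -0.7, -0.1, 0.3, 0.9, 0.0])
th_lo, th_hi, initial = -0.5, 0.5, False

hi = x >= th_hi
lo_or_hi = (x <= th_lo) | hi
ind = np.nonzero(lo_or_hi)[0]   # positions of the outside samples
cnt = np.cumsum(lo_or_hi)       # number of outside samples seen so far
last_outside = ind[cnt - 1]     # position of the most recent outside sample
# Where cnt is 0, cnt - 1 is -1 and wraps around to the last entry of ind;
# np.where masks that case and substitutes the initial value instead.
out = np.where(cnt, hi[last_outside], initial)

print(ind.tolist())           # [1, 3, 6]
print(cnt.tolist())           # [0, 1, 1, 2, 2, 2, 3, 3]
print(last_outside.tolist())  # [6, 1, 1, 3, 3, 3, 6, 6]
print(out.tolist())           # [False, True, True, False, False, False, True, True]
```

Sample 0 is the only one that falls back to initial, because no outside sample has been seen yet at that point.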

Credit: this is a trick I found in an old post on some Matlab forum, which I translated to NumPy. The code is a bit hard to understand and also needs to allocate several intermediate arrays. It would be better if NumPy included a dedicated function, similar to your simple for-loop, but implemented in C for speed.

Quick test:

import matplotlib.pyplot as plt

x = np.linspace(0,20, 1000)
y = np.sin(x)
h1 = hyst(y, -0.5, 0.5)
h2 = hyst(y, -0.5, 0.5, True)
plt.plot(x, y, x, -0.5 + h1, x, -0.5 + h2)
plt.legend(('input', 'output, start=0', 'output, start=1'))
plt.title('Thresholding with hysteresis')
plt.show()

Result: (plot of the input sine together with the two hysteresis outputs, one for each initial value)
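
The function above returns the latched state, while the question asked for the samples at which a low-to-high transition happens. One possible way to bridge the two, sketched here under some assumptions: rising_edges is a hypothetical helper name, its default initial=True mirrors the loop in the question (which starts in the "high" state), and hyst is repeated only so that the snippet runs on its own. Because hyst compares with >= and <= while the loop uses > and <, the two can differ for samples exactly equal to a threshold.

```python
import numpy as np

def hyst(x, th_lo, th_hi, initial=False):
    # Same function as in the answer, repeated to keep this sketch self-contained.
    hi = x >= th_hi
    lo_or_hi = (x <= th_lo) | hi
    ind = np.nonzero(lo_or_hi)[0]
    if not ind.size:
        return np.zeros_like(x, dtype=bool) | initial
    cnt = np.cumsum(lo_or_hi)
    return np.where(cnt, hi[ind[cnt - 1]], initial)

def rising_edges(x, th_lo, th_hi, initial=True):
    # Hypothetical helper: True where the latched state goes from low to high.
    state = hyst(x, th_lo, th_hi, initial)
    # State one sample earlier; the state before the first sample is `initial`.
    previous = np.concatenate(([initial], state[:-1]))
    return state & ~previous

# Test signal shaped like the one in the question, with a seeded generator.
rng = np.random.default_rng(0)
values = np.sin(np.linspace(0, 20, 1000)) + 0.4 * rng.random(1000) - 0.2
transitions = rising_edges(values, -0.3, 0.3)
print(np.flatnonzero(transitions))  # indices of the low-to-high transitions
```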

Etienne Dechamps
Bas Swinckels
  • @SpaceDog You could, but if you are using C it is probably better to write a simple loop similar to what was in the original question. The trick in my answer is faster in Python, since it uses vectorized numpy code, instead of a slow Python loop. The vectorized code has to pass several times over the data, while a simple loop in C could do everything in one pass. – Bas Swinckels Dec 30 '15 at 11:31
  • I asked that because I am trying to understand what your function does, so I can write a code in C... – Duck Dec 31 '15 at 01:22
  • Thank you! This solution is using indices in a creative way and reminds me of APL. – Tobia Mar 05 '22 at 19:47
  • This solution is incredibly clever, it took me a while to wrap my head around how this works. The key is to understand that there are two layers of indirection going on - `cnt-1` is an array of indices into `ind` which itself is an array of indices into `lo_or_hi`. `cnt` can be seen as "the number of outside values we've seen so far" and since `ind` is the list of indices of outside values, `cnt-1` de facto yields the *index* of the last outside value *index* to "latch on to". Very mind-bending. – Etienne Dechamps Aug 06 '23 at 22:02

These are modifications I had to make for my work, all based on the answer above by Bas Swinckels, to permit detection of threshold crossings with reversed as well as standard thresholds.

I'm not happy with the naming though; maybe it should now read th_hi2lo and th_lo2hi instead of th_lo and th_hi? With the original (non-reversed) values, the behaviour is the same though.

def hyst(x, th_lo, th_hi, initial = False):
    """
    x : Numpy Array
        Series to apply hysteresis to.
    th_lo : float or int
        Below this threshold the value of hyst will be False (0).
    th_hi : float or int
        Above this threshold the value of hyst will be True (1).
    """        

    if th_lo > th_hi: # If thresholds are reversed, x must be reversed as well
        x = x[::-1]
        th_lo, th_hi = th_hi, th_lo
        rev = True
    else:
        rev = False

    hi = x >= th_hi
    lo_or_hi = (x <= th_lo) | hi

    ind = np.nonzero(lo_or_hi)[0]  # indices of all samples at or beyond either threshold
    if not ind.size:  # prevent index error if ind is empty
        x_hyst = np.zeros_like(x, dtype=bool) | initial
    else:
        cnt = np.cumsum(lo_or_hi)  # counts outside samples: from 0 to len(ind)
        x_hyst = np.where(cnt, hi[ind[cnt-1]], initial)

    if rev:
        x_hyst = x_hyst[::-1]

    return x_hyst

As above, a quick test of the code shows what it does:

x = np.linspace(0,20, 1000)
y = np.sin(x)
h1 = hyst(y, -0.2, 0.2)
h2 = hyst(y, +0.5, -0.5)
plt.plot(x, y, x, -0.2 + h1*0.4, x, -0.5 + h2)
plt.legend(('input', 'output, classic, hyst(y, -0.2, +0.2)', 
            'output, reversed, hyst(y, +0.5, -0.5)'))
plt.title('Thresholding with hysteresis')
plt.show()

Sine with two different settings for hysteresis.
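
To see what the reversed-threshold branch amounts to, here is a small check on a hand-picked array. This is a condensed restatement of the function above, not a replacement for it: hyst_forward is a hypothetical name for the plain version from the first answer, and the sample values are made up. Going by the code, passing th_lo > th_hi swaps the thresholds and processes the signal back to front, so each sample takes its state from the next outside sample rather than the previous one.

```python
import numpy as np

def hyst_forward(x, th_lo, th_hi, initial=False):
    # Plain hysteresis for th_lo <= th_hi, as in the first answer.
    hi = x >= th_hi
    lo_or_hi = (x <= th_lo) | hi
    ind = np.nonzero(lo_or_hi)[0]
    if not ind.size:
        return np.zeros_like(x, dtype=bool) | initial
    cnt = np.cumsum(lo_or_hi)
    return np.where(cnt, hi[ind[cnt - 1]], initial)

def hyst(x, th_lo, th_hi, initial=False):
    # Reversed thresholds: swap them, run over the reversed signal,
    # then flip the result back into the original order.
    if th_lo > th_hi:
        return hyst_forward(x[::-1], th_hi, th_lo, initial)[::-1]
    return hyst_forward(x, th_lo, th_hi, initial)

x = np.array([0.0, 1.0, 0.0, -1.0, 0.0])
print(hyst(x, -0.5, 0.5).tolist())  # [False, True, True, False, False]
print(hyst(x, 0.5, -0.5).tolist())  # [True, True, False, False, False]
```

In the reversed case it is the trailing samples (after the last outside sample) that fall back to initial, instead of the leading ones.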

AMTQ

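Here's a solution that achieves the same result as the answer from Bas Swinckels, except that it returns a float array of 0.0/1.0 rather than booleans and treats samples exactly equal to a threshold as inside the band (it compares with < and > rather than <= and >=). It appears to have similar performance (on the order of 0.4 seconds for an array of 10 million elements when testing on Colab), but is easier to understand in my opinion: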

def hyst(x, th_lo, th_hi, initial = False):
    outside_values = np.full(x.size, np.nan)
    outside_values[0] = initial
    outside_values[x < th_lo] = 0
    outside_values[x > th_hi] = 1
    outside_value_indexes = np.where(np.isnan(outside_values), 0, np.arange(x.size))
    np.maximum.accumulate(outside_value_indexes, out=outside_value_indexes)
    return outside_values[outside_value_indexes]

And running the same test as the other answer:

x = np.linspace(0,20, 1000)
y = np.sin(x)
h1 = hyst(y, -0.5, 0.5)
h2 = hyst(y, -0.5, 0.5, True)
plt.plot(x, y, x, -0.5 + h1, x, -0.5 + h2)
plt.legend(('input', 'output, start=0', 'output, start=1'))
plt.title('Thresholding with hysteresis')
plt.show()


The basic idea comes from this answer to a separate question about forward-filling:

  1. First we create an outside_values array that maps values from the input to 0 if below the lower threshold, 1 if above the upper threshold, or NaN as a placeholder otherwise. Element 0 is pre-set to initial, so there is always a known value to fall back on when the signal starts between the thresholds.
  2. Then we create an outside_value_indexes array that lists all the indexes into outside_values (i.e. [0, 1, 2, 3, ...]). The indexes that map to inside values (NaNs) are replaced by 0.
  3. We replace outside_value_indexes with its cumulative maximum, i.e. each index is replaced by the maximum of all indexes up to and including it. Since we set the indexes of inside values to 0, they never contribute to the maximum, and instead the index of the last outside value seen so far is used.
  4. Now that outside_value_indexes maps each input value to the index of the last outside value, we can use that to index into outside_values and we're done!
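
The four steps above can be traced on a small hand-picked array. The sample values are an assumption made for illustration; the variable names follow the function above, and result is a name added here for the final lookup.

```python
import numpy as np

x = np.array([0.0, 0.8, 0.2, -0.7, -0.1, 0.3, 0.9, 0.0])
th_lo, th_hi, initial = -0.5, 0.5, False

# Step 1: 0 below the band, 1 above it, NaN inside it; element 0 starts as initial.
outside_values = np.full(x.size, np.nan)
outside_values[0] = initial
outside_values[x < th_lo] = 0
outside_values[x > th_hi] = 1

# Step 2: keep the own index for known values, 0 for NaN placeholders.
outside_value_indexes = np.where(np.isnan(outside_values), 0, np.arange(x.size))
print(outside_value_indexes.tolist())  # [0, 1, 0, 3, 0, 0, 6, 0]

# Step 3: the running maximum is the index of the most recent known value.
np.maximum.accumulate(outside_value_indexes, out=outside_value_indexes)
print(outside_value_indexes.tolist())  # [0, 1, 1, 3, 3, 3, 6, 6]

# Step 4: look up the state at those indexes (a float array of 0.0 and 1.0).
result = outside_values[outside_value_indexes]
print(result.tolist())  # [0.0, 1.0, 1.0, 0.0, 0.0, 0.0, 1.0, 1.0]
```

If a boolean array is needed, result.astype(bool) converts it.
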
Etienne Dechamps