For a data analysis task, I want to find zero crossings in a numpy array that comes from a convolution, first with a Sobel-like kernel and then with a Mexican hat kernel. Zero crossings allow me to detect edges in the data.

Unfortunately, the data is somewhat noisy and I only want to find zero crossings with a minimal jump size, 20 in the following example:

import numpy as np
arr = np.array([12, 15, 9, 8, -1, 1, -12, -10, 10])

Should result in

>>>array([1, 3, 7])

or

>>>array([3, 7])

Where 3 is the index of -1, just before the middle of the first jump and 7 is the index of -10

I have tried a modification of the following code (source: Efficiently detect sign-changes in python)

zero_crossings = np.where(np.diff(np.sign(np.trunc(arr/10))))[0]

Which correctly ignores small jumps, but puts the zero-crossings at [1,5,7]

What would be an efficient way of doing this?

The definition of minimal jump is not strict, but results should be along the lines of my question.

Edit: For Clarification

arr = np.array([12, 15, 9, 8, -1, 1, -12, -10, 10])
arr_floored = np.trunc(arr/10)
>>>>array([ 1.,  1.,  0.,  0., -0.,  0., -1., -1.,  1.])
sgn = np.sign(arr_floored)
>>>>array([ 1,  1,  0,  0,  0,  0, -1, -1,  1])
dsgn = np.diff(sgn)
>>>>array([ 0, -1,  0,  0,  0, -1,  0,  2])
np.where(dsgn)
>>>>(array([1, 5, 7], dtype=int64),)

Further edge cases:

arr = [10,9,8,7,6,5,4,3,2,1,0,-1,-2,-3,-4,-5,-6,-7,-8,-9,-10]

Should result in

>>> np.array([10])

Also just noticed: The problem might be ill-posed (in a mathematical sense). I will clarify it later today.

AlexNe

2 Answers

Base case

I guess you want

import numpy as np
x = np.array([10, -50, -30, 50, 10, 3, -200, -12, 123])
indices = np.where(np.logical_and(np.abs(np.diff(x)) >= 20, np.diff(np.sign(x)) != 0))[0]

read as: indices where (the absolute differences of x are greater than or equal to 20) and (the sign flips)

which returns

array([0, 2, 5, 7])
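
As a minimal sketch, the expression above can be wrapped in a function to try it on other inputs. The name `pairwise_crossings` and the `min_jump` parameter are illustrative and not part of the original answer. Because it only compares adjacent samples, it finds a single crossing in the question's first array and none in a gradual ramp from 10 to -10:

```python
import numpy as np

def pairwise_crossings(x, min_jump=20):
    """Indices i where the sign changes between x[i] and x[i+1]
    (including to or from zero) and |x[i+1] - x[i]| >= min_jump."""
    x = np.asarray(x)
    big_jump = np.abs(np.diff(x)) >= min_jump   # adjacent difference is large enough
    sign_flip = np.diff(np.sign(x)) != 0        # sign differs between neighbours
    return np.where(big_jump & sign_flip)[0]

print(pairwise_crossings([10, -50, -30, 50, 10, 3, -200, -12, 123]))  # [0 2 5 7]
print(pairwise_crossings([12, 15, 9, 8, -1, 1, -12, -10, 10]))        # [7]
print(pairwise_crossings(np.arange(10, -11, -1)))                     # []
```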

Periodic signal

The usual numpy functions don't cover this case. I would suggest simply appending the first element to the end, via the pad function:

import numpy as np
x = np.array([10, 5, 0, -5, -10])
x = np.pad(x, (0, 1), 'wrap')
indices = np.where(np.logical_and(np.abs(np.diff(x)) >= 20, np.diff(np.sign(x)) != 0))[0]
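
A quick check of what the padded version returns, as a sketch with a hypothetical function name (`periodic_crossings`) and a hypothetical `min_jump` parameter. An index of `len(x) - 1` in the result means the crossing lies between the last element and the wrapped-around first element:

```python
import numpy as np

def periodic_crossings(x, min_jump=20):
    """Same pairwise test as above, but the last sample is also
    compared with the first one (periodic signal)."""
    x = np.pad(np.asarray(x), (0, 1), 'wrap')   # append x[0] to the end
    big_jump = np.abs(np.diff(x)) >= min_jump
    sign_flip = np.diff(np.sign(x)) != 0
    return np.where(np.logical_and(big_jump, sign_flip))[0]

print(periodic_crossings([10, 5, 0, -5, -10]))  # [4]: the jump from -10 back to 10
```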
Obay
  • Will test tomorrow morning :) – AlexNe May 27 '19 at 15:34
  • What about the case `[10,9,8,7,6,5,4,3,2,1,0,-1,-2,-3,-4,-5,-6,-7,-8,-9,-10]`? There should be exactly 1 zero crossing at 0, since the total difference in this vector is 20. But your code does not detect any. – AlexNe May 29 '19 at 06:21
  • Hm... I think it may be impossible to find a way which works for all cases you might have in mind. For this case to work, you need to do some kind of downsampling or an equivalent operation, which will necessarily result in ignoring actual zero jumps at this resolution – Zaw Lin May 29 '19 at 07:03
  • I see, your signal is meant to be periodic. Sadly, the usual numpy functions don't really cover that case. I guess you need an additional one-liner there. In principle there are several options, depending on the performance you require. Also, the zero crossing is not supposed to be at 0, because the jump 10->9 is not a zero crossing here. It rather is at 20. – Obay May 29 '19 at 07:53
  • I edited the answer with padding, that should be the most readable approach – Obay May 29 '19 at 08:07
  • The signal is not periodic. In my comment it crosses 0 coming from 10, going to -10. Hence a jump of 20. The code in my answer puts the 0 crossing at `index=0` and `index=19`, but it should be in the middle of those two. Anyway, upvote for effort! And sorry for my unclear comment. Zero crossing in the middle of the vector. Just by coincidence that's also where `x=0`! – AlexNe May 29 '19 at 09:15
  • Ah, so you don't want only zero crossings between pairs, but also further distances. I am not sure about the exact specifications here. An image of what you want to detect might help. – Obay May 29 '19 at 09:20

Here's a solution that returns the midpoint of each crossing, using a noise threshold to filter out fluctuations around zero that may span multiple data points. It gives the correct answers for the two examples you supplied. However, I've made a couple of assumptions:

  • You didn't define precisely what range of data points to consider to determine the midpoint of the crossing, but I've used your sample code as a basis - it was detecting crossings where ABS(start | end) >= 10 hence I've used the minimum range where this condition holds.
    NB: This does not detect a transition from +15 to -6.
    EDIT: Actually it's not always the minimum range, but the code should be enough for you to get started and adjust as needed.
  • I've assumed that it is ok to also use pandas (to track the indexes of data points of interest). You could probably avoid pandas if essential.

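import numpy as np
import pandas as pd

arr = np.array([12, 15, 9, 8, -1, 1, -12, -10, 10])
# Sign of each sample after truncation: 0 wherever abs(value) < 10
sgn = pd.Series(np.sign(np.trunc(arr/10)))
# Drop the zeros, then diff: a nonzero entry marks a sign flip (the first entry is NaN)
trailingEdge = sgn[sgn!=0].diff()
# Indices of the first nonzero sample and of every sign flip
edgeIndex = np.array(trailingEdge[trailingEdge!=0].index)
# Midpoint between consecutive edge indices
edgeIndex[:-1] + np.diff(edgeIndex) / 2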

gives:

array([3., 7.])

and

arr = np.array([10,9,8,7,6,5,4,3,2,1,0,-1,-2,-3,-4,-5,-6,-7,-8,-9,-10])

gives:

array([10.])
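
Since the answer notes that pandas could probably be avoided, here is a minimal NumPy-only sketch of the same idea. The function name `midpoint_crossings` and the `threshold` parameter are assumptions for illustration. It is meant to reproduce the two results above, not to settle the edge cases mentioned in the assumptions (for example, a transition from +15 to -6 is still not detected).

```python
import numpy as np

def midpoint_crossings(arr, threshold=10):
    """Midpoints between consecutive 'edge' indices, where an edge is the
    first sample with |value| >= threshold or the first such sample after
    a sign flip. Samples with |value| < threshold are ignored."""
    arr = np.asarray(arr, dtype=float)
    sgn = np.sign(np.trunc(arr / threshold))   # 0 wherever |value| < threshold
    nz = np.flatnonzero(sgn)                   # indices of the significant samples
    if nz.size == 0:
        return np.array([], dtype=float)
    # Positions within nz where the sign differs from the previous significant sample
    flips = np.flatnonzero(np.diff(sgn[nz]) != 0) + 1
    edge_index = np.concatenate(([nz[0]], nz[flips]))
    return edge_index[:-1] + np.diff(edge_index) / 2

print(midpoint_crossings([12, 15, 9, 8, -1, 1, -12, -10, 10]))  # [3. 7.]
print(midpoint_crossings(np.arange(10, -11, -1)))               # [10.]
```

As in the pandas version, the first edge is the first significant sample, so the first midpoint is measured from there rather than from the last sample before the sign change.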

Mike