I'm looking for an easy way to find "plateaus" or groups in Python lists. As input, I have something like this:
mydata = [0.0, 0.0, 0.0, 0.0, 0.0, 0.143, 0.0, 0.22, 0.135, 0.44, 0.1, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.33, 0.65, 0.22, 0.0, 0.0, 0.0, 0.0, 0.0]
I want to extract the middle position of every "group". A group is defined here as a run of values that are !=0 and that is, for example, at least 3 positions long. Single zeros enclosed by non-zero values (like the one at position 6) should be ignored.
Basically, I want to get the following output:
myoutput = [8, 20]
For my use case, very precise output is not important; [10,21] would still be fine.
To summarize: the first group is [0.143, 0.0, 0.22, 0.135, 0.44, 0.1] and the second group is [0.33, 0.65, 0.22]. What I want is the position of the middle element of each group (or the element to the left or right of the middle, if there is no true middle value). So in the output, 8 would be the middle of the first group and 20 the middle of the second group.
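To pin down the rule described above, here is a minimal pure-Python sketch of it. The function name `group_middles` and the parameters `min_len` and `max_gap` are hypothetical names chosen for illustration, and rounding to the right when there is no true middle is just one of the two acceptable choices.

```python
from itertools import groupby


def group_middles(data, min_len=3, max_gap=1):
    """Return the middle index of every group of non-zero values.

    min_len and max_gap are illustrative parameters: a group must span at
    least min_len positions, and up to max_gap consecutive zeros enclosed
    by non-zero values are treated as part of the group.
    """
    # Collect (start, end) index pairs of consecutive non-zero values.
    runs = []
    pos = 0
    for nonzero, items in groupby(data, key=lambda v: v != 0):
        length = len(list(items))
        if nonzero:
            runs.append((pos, pos + length - 1))
        pos += length

    # Merge neighbouring runs separated by at most max_gap zeros.
    merged = []
    for start, end in runs:
        if merged and start - merged[-1][1] - 1 <= max_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))

    # Drop groups that are too short; when there is no true middle
    # element, (s + e + 1) // 2 picks the right-hand neighbour.
    return [(s + e + 1) // 2 for s, e in merged if e - s + 1 >= min_len]


mydata = [0.0, 0.0, 0.0, 0.0, 0.0, 0.143, 0.0, 0.22, 0.135, 0.44, 0.1,
          0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.33, 0.65, 0.22,
          0.0, 0.0, 0.0, 0.0, 0.0]
print(group_middles(mydata))  # [8, 20]
```

Raising `max_gap` would tolerate longer stretches of enclosed zeros, at the cost of possibly joining groups that should stay separate.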
I've already tried some approaches. But they are not as stable as I wanted them to be (for example: more enclaved zeros can cause problems). So before investing more time in this idea, I wanted to ask if there is a better way to implement this feature. I even think that this could be a generic problem. Is there maybe already standard code that solves it?
There are other questions that describe roughly the same problem, but I also need to "smooth" the data before processing.
Smooth the data to get rid of enclosed zeros:

    import numpy as np

    def smooth(y, box_pts):
        box = np.ones(box_pts)/box_pts
        y_smooth = np.convolve(y, box, mode='same')
        return y_smooth

    y_smooth = smooth(mydata, 20)
Find start points in the smoothed list (if a value is !=0 and the value before it was 0, it should be a start point). When an end point is found, use the last start point that was found and the current end point to get the middle position of the group, and write it to a deque.

    from collections import deque

    laststart = 0
    lastend = 0
    myoutput = deque()
    for i in range(1, len(y_smooth)-1):
        # detect start: non-zero value preceded by a zero
        if y_smooth[i]!=0 and y_smooth[i-1]==0:
            laststart = i
        # detect end: non-zero value followed by a zero, group longer than 2
        elif y_smooth[i]!=0 and y_smooth[i+1]==0 and laststart+2 < i:
            lastend = i
            myoutput.appendleft(laststart+(lastend-laststart)/2)
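As a point of comparison for the loop above, the start/end detection can also be sketched with `np.diff` on a boolean mask. This is only an illustration under my own assumptions: `run_bounds` and the short `signal` list are hypothetical, and the middle is rounded to the left here. Because the mask is padded, groups that touch the first or last index are found as well, which the loop over `range(1, len(y_smooth)-1)` does not cover.

```python
import numpy as np


def run_bounds(values):
    """Return the first and last index of every run of non-zero values."""
    mask = np.asarray(values) != 0
    # Pad with False on both sides so runs touching either end are detected.
    padded = np.concatenate(([False], mask, [False])).astype(int)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)      # zero -> non-zero transitions
    ends = np.flatnonzero(edges == -1) - 1   # last non-zero index of each run
    return starts, ends


# Hypothetical toy signal: one run of length 3 and one run that ends the list.
signal = [0.0, 0.2, 0.4, 0.3, 0.0, 0.0, 0.5, 0.1, 0.6, 0.7]
starts, ends = run_bounds(signal)

# Keep only runs that are at least 3 positions long.
keep = (ends - starts + 1) >= 3
middles = (starts[keep] + ends[keep]) // 2
print(middles.tolist())  # [2, 7]
```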
EDIT: To simplify things, I gave only a short example of my input data at the beginning. That short list actually causes problematic smoothing output: the whole list gets smoothed, and no zeros are left. [Plots: actual input data; actual input data after smoothing]
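The effect described in the EDIT can be checked directly on the short example list. The sketch below reuses the `smooth` function from above; the 3-point box is only an assumed alternative for comparison, not a recommended setting.

```python
import numpy as np


def smooth(y, box_pts):
    # Same moving-average (box) filter as in the question.
    box = np.ones(box_pts) / box_pts
    return np.convolve(y, box, mode='same')


mydata = [0.0, 0.0, 0.0, 0.0, 0.0, 0.143, 0.0, 0.22, 0.135, 0.44, 0.1,
          0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.33, 0.65, 0.22,
          0.0, 0.0, 0.0, 0.0, 0.0]

wide = smooth(mydata, 20)
narrow = smooth(mydata, 3)

# With a 20-point box, every window over this 27-element list overlaps at
# least one non-zero sample, so no zeros survive.
print(np.count_nonzero(wide == 0))  # 0

# A 3-point box fills the single zero at index 6 but keeps the long zero
# stretch between the two groups (index 15 lies inside it).
print(narrow[6] > 0, narrow[15] == 0)  # True True
```

Note that even the narrow box widens each group by one position on either side, so the detected start and end points shift outward accordingly.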