6

I have two arrays of data taken from an experiment; x (time) and y (voltage). These are plotted below. Every 200 microseconds, the signal is one of three states:

  1. Take a negative value (i.e. < -0.25 V)
  2. Take a positive value (i.e. > 0.3 V)
  3. Remain at the level of the noise (i.e. ~between 0 and 0.1 V)

x,y plotted

I would like to 'digitise' this signal, so that case 1., 2., and 3., correspond to discrete values of -1, +1 or 0.

How to do this best by analysing the arrays of data?

My current idea:

  • To interpolate to find x values where y meet each threshold.

np.interp(0.5,x,y)

Issue:

  • How do this for multiple positions where the y threshold is met?
user
  • 5,370
  • 8
  • 47
  • 75
8765674
  • 1,234
  • 4
  • 17
  • 32
  • You already seem to have the solution sorted out; you just have to turn the cases to code directly. Do you have any other problem in fixing the threshold? – Ébe Isaac Nov 18 '16 at 18:41
  • You already know the behavior you want, so can you simply write a function that implements that behavior and map it over your data? – user3030010 Nov 18 '16 at 18:43
  • I mention that I'm unsure how to do this so that I count all of the positions in the trace where the threshold is met, rather than just the first case, which `np.interp()` returns – 8765674 Nov 18 '16 at 18:51

4 Answers4

2

For N thresholds, there are N+1 levels. Level crossings are found using the following steps:

  • Iterate through all the samples and then thresholds,
  • Check the current level and sample for a crossing, by comparing the products of a combination of absolute values, for example: abs(y1)*y0 != y1*abs(y0).
  • If a level crossing has occurred, the x-value of the crossing is found using a linear equation.
  • The polarity of the slope tells you if the crossing came from a level below or above, indicating the current level.

The following function implements the above:

def quantizer(time,voltage,thresholds):
    x0=time[0]; y0=voltage[0];
    l0 = 0
    for level,thresh in enumerate(thresholds):
        l0 = level if y0>thresh else l0
    xings=[tuple([x0,l0])]
    for x1,y1 in zip(time[1:],voltage[1:]):
        xings_i=[]
        dy=y1-y0
        dx=x1-x0
        dy_dx = dy/dx
        for level,thresh in enumerate(thresholds):
            y1s=y1-thresh
            y0s=y0-thresh
            if dy!=0:
                if abs(y1s)*y0s!=y1s*abs(y0s):
                    xings_i.append(tuple([x0-y0s/dy_dx,level+1*(dy_dx>0)]))
        xings_i.sort(key=lambda tup: tup[0])
        xings+=xings_i
        x0=x1
        y0=y1
    return xings

The following inputs were used to test the function:

time =    [  0,   1,   2,   3,   4 ]
voltage = [-2.0,-1.2, 2.0, 1.1,-4.0]
thresholds = [-0.25,0.3]

The levels returned by this function are unsigned and start from zero, so the result of the function is shifted down by one:

def main():
    for t,level in quantizer(time,voltage,thresholds):
        print(str(round(t,2)) + "\t" + str(level-1))

The results look as follows:

0       -1
1.3     0
1.47    1
3.16    0
3.26    -1
Tjaart
  • 496
  • 8
  • 20
0

Assuming you have our data in a pandas datatable (data with columns x and y), I would first clean your data a bit (digitize it).

data.loc[data['y'] > 0.3, 'y'] = +1
data.loc[data['y'] < -0.25, 'y'] = -1
data.loc[(data['y'] >= -0.25) & (data['y'] <= 0.3), 'y'] = 0

Afterwars I would check the slope by calculating the difference between supsequent elements

data_diff = data.set_index('x').diff(periods=1, axis=0)

Now run through your differences and count

state = 0
pos_peaks = 0
neg_peaks = 0
flats = 0
for row in data_diff.itertuples():
    if row[1] > 0 and state == 0:
        pos_peaks += 1
        state = 1
    elif row[1] < 0 and state == 0:
        neg_peaks +=1
        state = -1
    elif row[1] < 0 and state == 1:
        flats += 1
        state = 0
    elif row[1] > 0 and state == -1:
        flats += 1
        state = 0
RaJa
  • 1,471
  • 13
  • 17
  • Not using pandas. What is the purpose of `.diff`? – 8765674 Nov 21 '16 at 14:04
  • `.diff()` is a function in pandas that calculates the difference along an axis. In the given case it calcultes the difference between suceeding rows. see http://stackoverflow.com/a/13115473/4141279 – RaJa Nov 21 '16 at 14:14
0

As one of the solutions you can replace them with required values:

 >>> a=np.random.rand(10)-0.5  # just for test
 >>> a
 array([-0.16532652,  0.17079655, -0.08183734, -0.42446642, -0.15749626,
         0.31809266,  0.09871911,  0.10324168,  0.43021048,  0.16749866])
 >>> a[a<-0.3] = -1
 >>> a[a>0.25] = +1
 >>> a
 array([-0.16532652,  0.17079655, -0.08183734, -1.        , -0.15749626,
         1.        ,  0.09871911,  0.10324168,  1.        ,  0.16749866])
 >>> a[(a>-1) & (a<1)]=0
 >>> a
 array([ 0.,  0.,  0., -1.,  0.,  1.,  0.,  0.,  1.,  0.])
 >>>
Eugene Lisitsky
  • 12,113
  • 5
  • 38
  • 59
0

You can discretize the data by doing checks like my_array > 0.3, which returns a boolean array that can be turned to 1's and 0's: np.array( my_array > 0.3 ,int).

The positions of the jumps can be found by subtracting a shifted signal (inserting a copy of the first element of the signal) from the signal:

def jump(time,signal):
    d_signal      = np.array(signal>.3,int) -np.array(signal<-.25,int)
    shift         = d_signal - np.insert(d_signal,0,d_signal[0])[:-1]
    time_of_jump  = time[shift!=0]
    jump_to_value = d_signal[shift!=0]

    return time_of_jump,jump_to_value

I've tested this on some random signal that looks somewhat similar to yours:

import numpy as np
import matplotlib.pyplot as plt

jumps=np.around(np.random.rand(30)*2-1)*.5
signal=np.array([])

for i in jumps:
    signal=np.hstack((signal , np.ones(10)*i + np.random.normal(0,.05,10)))
time=np.arange(0,len(signal))*.01+239

print(jump(time,signal)) # Python 3
print jump(time,signal)  # Python 2

Returns an array of the times the jumps have occurred (the first element with the new value) and the values (+1,0,-1) to which the signal has jumped.

Haminaa
  • 388
  • 1
  • 7