How can I find the duration a list element exceeds a certain criteria?

Question

I'm analysing EEG data for my honours project. Specifically, I'm analysing EEG bursts (i.e. the number of counts that it exceeds a criteria). The EEG values are recorded every second. In my case, the criteria of a EEG burst is every time the EEG value exceeds 15% of the 5-minute mean. However, I would also like to find the duration that it remains above 15% of the mean. How can I do this?

edit: this is the data for a 5-min block, regions2[0][0][0]

array([2.6749268, 2.3965783, 2.2648365, 2.2901094, 2.5218956, 2.6369736,
   2.3637865, 2.1217346, 1.895203 , 2.055559 , 1.9365673, 2.4218671,
   2.530769 , 2.385391 , 2.293663 , 2.3126507, 2.323733 , 2.2733889,
   2.3903291, 2.6176193, 2.6430926, 2.58586  , 2.352392 , 2.4955454,
   2.6099124, 2.3200274, 2.1760976, 2.5159674, 2.76305  , 2.3733828,
   2.4342089, 2.4008656, 2.2075768, 2.232682 , 2.2406263, 2.4858663,
   2.4188566, 2.5680597, 2.5303915, 2.3958497, 2.2115357, 2.444274 ,
   2.5103524, 2.0567694, 2.441487 , 2.430129 , 2.4614134, 2.282298 ,
   2.4610975, 2.5782802, 2.3088896, 2.660237 , 2.8228939, 2.386515 ,
   1.9969627, 2.0703123, 2.891341 , 2.929259 , 2.3676789, 2.39686  ,
   2.559953 , 2.4817688, 2.4235504, 2.2657301, 2.6064477, 2.6751654,
   2.5263813, 2.1663566, 2.2710345, 2.6688013, 2.1095626, 2.560567 ,
   2.6420567, 2.3834925, 2.4658787, 2.6067703, 2.5786612, 2.6147954,
   2.5842502, 2.5785747, 2.427758 , 2.2909386, 2.2653525, 2.382083 ,
   2.5664327, 2.5153337, 1.820536 , 2.5582454, 2.3047743, 2.0991004,
   2.4578576, 2.7292717, 2.6083386, 2.4281838, 2.453028 , 2.4099083,
   2.3806388, 2.3578563, 2.590041 , 2.621177 , 2.3468106, 2.145658 ,
   2.077852 , 2.2439861, 2.6040363, 2.7262418, 2.5456822, 2.4032714,
   2.4305286, 2.3440735, 2.4467494, 2.8309298, 2.484087 , 2.4205194,
   2.9501045, 2.9746544, 2.4083674, 2.3036501, 2.6792996, 2.3589804,
   2.3387434, 3.1610718, 3.1351097, 2.5165584, 2.52014  , 2.4717925,
   2.3442857, 2.4484215, 2.6329467, 2.5656624, 2.1032746, 2.6107414,
   2.6543136, 2.3989596, 2.491326 , 2.448652 , 2.0739408, 2.1546881,
   2.2125206, 2.3453302, 2.206572 , 2.5481203, 2.4757648, 2.321009 ,
   2.151885 , 2.5293174, 2.4324925, 2.5090148, 2.511013 , 2.3891613,
   2.410615 , 2.4933898, 2.5637872, 2.5406888, 2.3262205, 2.3038528,
   2.3525562, 2.5020993, 2.418803 , 2.3449333, 2.3671083, 2.2996266,
   2.3794844, 2.567588 , 2.4661932, 2.2262957, 2.23916  , 2.4768639,
   2.5996208, 2.0967448, 2.6293585, 2.63859  , 2.1346848, 2.396465 ,
   2.5590088, 2.7100194, 2.5844605, 2.509511 , 2.3888776, 2.3450603,
   2.3059156, 2.3003197, 2.265796 , 2.409108 , 2.3471978, 2.3690042,
   2.3681839, 2.1804225, 2.3097868, 2.5214322, 2.5647783, 2.2263277,
   2.237076 , 2.1725235, 2.7022493, 2.6051672, 2.3920796, 2.5295384,
   2.3592722, 2.3246412, 2.6447232, 2.471837 , 2.7084064, 2.891163 ,
   2.853586 , 2.5431767, 2.626647 , 2.384906 , 2.4660795, 2.3987703,
   1.952864 , 2.200941 , 2.0904007, 2.3755703, 2.4843922, 2.2047417,
   2.356966 , 2.3752437, 2.3846717, 2.306039 , 2.1720207, 2.3954203,
   2.3085623, 2.3531506, 1.9799676, 2.211963 , 2.141896 , 2.23061  ,
   2.3369045, 2.0759099, 2.3769715, 2.2083194, 2.3833442, 2.519347 ,
   2.3158543, 2.349166 , 2.4172142, 2.4422796, 2.513461 , 2.3867416,
   2.2908096, 2.5697057, 2.5763984, 2.319024 , 2.2902663, 2.6102533,
   2.6331408, 2.0932658, 2.323995 , 2.5237315, 2.5517514, 2.2664344,
   2.6006377, 2.3638246, 2.1815908, 2.440507 , 2.319951 , 2.399297 ,
   2.4986413, 2.3624146, 2.2377589, 2.277511 , 2.2512653, 2.1235788,
   2.1558921, 2.0374463, 2.2526782, 2.3825428, 2.2467673, 2.2891142,
   2.2968547, 2.2971904, 2.3470707, 2.2792215, 2.4101918, 2.4403389,
   2.674443 , 2.7500827, 2.4176645, 2.4565153, 2.5339708, 2.545448 ,
   2.3907104, 2.4448986, 2.571172 , 2.3371966, 2.5683126, 2.6280603,
   2.4276655, 2.4700544, 2.421201 , 2.499506 , 2.555047 , 2.7138271,
   2.546179 , 2.484039 , 2.3956237, 2.591265 , 2.6552162, 2.6930716],
  dtype=float32)

1.15 * np.mean(regions2[0][0][0])
Out[17]: 2.783561062812805

np.where(regions2[0][0][0] > 1.15 * np.mean(regions2[0][0][0]))
Out[13]: (array([ 52,  56,  57, 111, 114, 115, 121, 122, 203, 204], dtype=int64),)

edit: Looking at the index numbers of values greater than 1.15 of the mean, I realised that shows the duration that I'm looking for. i.e. if it's a single number 52 then its a 1 second burst, if it's 56 and 57 (2 consecutive index numbers), then its a 2 second burst. How can I make a script to calculate the average burst durations for this each 5 min block? In this example, it would be average of (1s + 2s + 1s + 2s + 2s + 2s)

I've attached a code snippet that works and is able to count the number of bursts by iterating through the ndarray and comparing two consecutive EEG values (add 1 count of burst (temp +=1) if the former value is less than 1.15 of the mean and the latter value is more than 1.15 of the mean). This is for one 5-minute block and I repeat this for 18 5-minute blocks.

burst = 0
for l in range(1,len(regions[i][j][k])): #regions[i][j][k] refers to a 5-min block of EEG data
    if regions[i][j][k][l-1] < (1.15 * np.nanmean(regions[i][j][k])) and regions2[i][j][k][l] > (1.15 * np.nanmean(regions2[i][j][k])):
        burst += 1 #counts a burst

I was thinking that I could find the index number of the point at which the burst occurs and index number of the value at which the burst ends (EEG value drops below 1.15 of the mean). The duration, in seconds, would be the difference between the two index values.

However, I have no idea how to do this, or if this is the best way. These bursts occur multiple times so my ultimate objective is to find the average values of these burst every 5 minutes. I just started learning python a few months back so any help would be greatly appreciated!

It's hard to help you out without data. Can you provide a constructed np.ndarray with data and [edit]/add it to your question? — Patrick Artner, Sep 21 '19 at 10:17
Hi @PatrickArtner, thanks for replying! I've added the data for one 5-minute block and I think the solution may be (a bit) simpler than I thought. By looking at the index values of elements that meet my criteria, I can see the duration of each 'burst' (I described it in the edited paragraph). I'm just not sure how to go about counting these index numbers. Hope I didn't confuse you. — lowlet, Sep 21 '19 at 11:14

Patrick Artner · Answer 1 · 2019-09-21T11:49:00.480

You can get these regions using numpy. I leveraged part of an answer from another question to do this.

Results with index positions and number of points between them as well as screenshot of console/plot:

The code to do this (You had no data given, I used a sine-wave as base):

import numpy as np
import datetime
import matplotlib.pyplot as plot

base = datetime.datetime(2019, 1, 1,8,0,0)
tarr = np.array([base + datetime.timedelta(seconds=i) for i in range(60*60)]) #  1 h

time = np.arange(0, 6*6, 0.01) *0.2;
amp = np.sin(time)

# get all indexes that interest me - yours would have the mean here
# not used in rest of code: contains all indexes that are above 0.75
indx = np.nonzero(abs(amp) > 0.75)
# looks like: (array([ 425,  426,  427, ..., 3597, 3598, 3599], dtype=int64),)


def contiguous_regions(cond):
    """Credits: https://stackoverflow.com/a/4495197/7505395"""
    """Finds contiguous True regions of the boolean array "condition". Returns
    a 2D array where the first column is the start index of the region and the
    second column is the end index."""

    # Find the indicies of changes in "condition"
    d = np.diff(cond)
    idx, = d.nonzero() 

    # We need to start things after the change in "condition". Therefore, 
    # we'll shift the index by 1 to the right.
    idx += 1

    if condition[0]:
        # If the start of condition is True prepend a 0
        idx = np.r_[0, idx]

    if condition[-1]:
        # If the end of condition is True, append the length of the array
        idx = np.r_[idx, cond.size] 

    # Reshape the result into two columns
    idx.shape = (-1,2)
    return idx


condition = np.abs(amp) > 0.75

# create a plot visualization
fig, ax = plot.subplots()
ax.plot(tarr, amp)

# print values that fullfill our condition - also do some plotting
for start, stop in contiguous_regions(condition):
    segment = amp[start:stop]
    print (start, stop, f"Amount: {stop-start}")

    # plot some lines into the graph, off by 1 for stops in graph
    ax.vlines(x=tarr[start], ymin=amp[start]+(-0.1 if amp[start]>0 else 0.1),
                             ymax=1.0 if amp[start]>0 else -1.0, color='r')
    ax.vlines(x=tarr[stop-1], ymin=amp[start]+(-0.1 if amp[start]>0 else 0.1), 
                              ymax=1.0 if amp[start]>0 else -1.0, color='r')
    ax.hlines(y=amp[start]-0.08,xmin=tarr[start],xmax=tarr[stop-1])

# show plot    
plot.show()

The trick is to apply ndp.diff to the array with [False,True,...]'s in them that fullfill the condition. Then get the indexes from np.nonzero() result-tuple.

How can I find the duration a list element exceeds a certain criteria?

1 Answers1

Linked