I have a one-year dataset of values, and I want to detect and count the periods of consecutive values above or below a pre-specified threshold. All I need returned is the length of each such period. I found code online that does almost exactly this (shown below, the function titled "fire_season_length"), except that it fails to return the final consecutive period before the dataset ends (at the end of the year).
I believe this happens because a period of consecutive values is only recorded once the series flips from above the threshold to below it, or vice versa.
Here is the function I am using to count consecutive above/below-threshold periods:
    def fire_season_length(ts, threshold):
        ntot_ts = ts.count()  # total number of values in ts (timeseries)
        n_gt_threshold = ts[ts >= threshold].count()  # number of values at or above threshold
        type_day = 0  # below threshold
        type_day = 1  # meets or exceeds threshold
        type_prev_day = 0  # initialize first day
        storage_n_cons_days = [[], []]  # [[cons days above threshold], [cons days below threshold]]
        n_cons_days = 0
        for cur_day in ts:  # current day in timeseries
            if cur_day >= threshold:
                type_cur_day = 1
                if type_cur_day == type_prev_day:  # same type as previous day
                    n_cons_days += 1
                else:  # different type from previous day
                    storage_n_cons_days[1].append(n_cons_days)
                    n_cons_days = 1
                    type_prev_day = type_cur_day
            else:
                type_cur_day = 0
                if type_cur_day == type_prev_day:
                    n_cons_days += 1
                else:
                    storage_n_cons_days[0].append(n_cons_days)
                    n_cons_days = 1
                    type_prev_day = type_cur_day
        return ntot_ts, n_gt_threshold, storage_n_cons_days
This is the output when I run a timeseries through the function. I've annotated the plot to show that there are 7 periods of consecutive values, yet the returned array [[13, 185, 30], [24, 78, 12]] (which indicates [[periods above threshold], [periods below threshold]]) lists only six. It seems that period 7 is not reported, and I see the same behaviour with the other timeseries I tested in this function. See annotated plot here.
So my question is: how do I get my code to return the final period of consecutive values, even though the series has not yet flipped to the other side of the threshold?
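
For reference, here is a minimal sketch of one way the final period could be captured. It is not the original function: it swaps the manual loop for `itertools.groupby`, which emits one group per run of equal keys, including the last run. The names `run_lengths` and `sample` are hypothetical, and the sketch assumes the timeseries is any iterable of numbers, so a pandas Series should work too. Note that NaN compares as not >= threshold, so it would be counted as below-threshold unless dropped first.

```python
from itertools import groupby


def run_lengths(values, threshold):
    """Return [[run lengths at/above threshold], [run lengths below threshold]]."""
    above, below = [], []
    # groupby yields one group per maximal run of identical keys, so the last
    # run is emitted even though the series never flips again after it.
    for is_above, run in groupby(values, key=lambda v: v >= threshold):
        length = sum(1 for _ in run)  # count the items in this run
        (above if is_above else below).append(length)
    return [above, below]


# Hypothetical data: runs are below(2), above(3), below(1), above(2).
sample = [1, 1, 5, 5, 5, 2, 6, 6]
print(run_lengths(sample, threshold=4))  # [[3, 2], [2, 1]]
```

If the original loop is to be kept instead, the same effect can probably be had by appending `n_cons_days` once more after the loop ends, to whichever list of `storage_n_cons_days` matches `type_prev_day`.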