I am working on solving another problem here: Fill in time series gaps with both LCOF and NOCB methods but acknowledge breaks in time series and it raised a new issue.
I have a time series dataset with missing data (NA).
> str(final_daily_intake2)
'data.frame': 387 obs. of 11 variables:
$ Date : chr "2014-08-13" "2014-08-14" "2014-08-15" "2014-08-16" ...
$ MEID.1 : chr NA NA NA "14" ...
$ MEID.2 : Factor w/ 184 levels "1","100","100.1",..: NA NA NA 143 48 NA NA NA NA NA ...
$ MEID.3 : Factor w/ 180 levels "100","100.1",..: NA NA NA 24 134 NA NA NA NA NA ...
$ MEID.4 : Factor w/ 42 levels "173","173a","173b",..: NA NA NA 17 1 NA NA NA NA NA ...
$ MEID.5 : Factor w/ 3 levels "d1","s1","s2": NA NA NA 2 3 NA NA NA NA NA ...
$ MEID.6 : Factor w/ 1 level "s2": NA NA NA NA NA NA NA NA NA NA ...
$ DAYT : int NA NA NA 1 8 NA NA NA NA NA ...
$ DATT : int NA NA NA 1 1 NA NA NA NA NA ...
$ Reason.For.Change: chr "0" "0" "0" "0" ...
$ GAP_Days : chr "1" "2" "3" "NA" ...
And head() on the data gives
Date MEID.1 MEID.2 MEID.3 MEID.4 MEID.5 MEID.6 DAYT DATT Reason.For.Change GAP_Days
1 2014-08-13 <NA> <NA> <NA> <NA> <NA> <NA> NA NA 0 1
2 2014-08-14 <NA> <NA> <NA> <NA> <NA> <NA> NA NA 0 2
3 2014-08-15 <NA> <NA> <NA> <NA> <NA> <NA> NA NA 0 3
4 2014-08-16 14 61 117 187 s1 <NA> 1 1 0 NA
5 2014-08-17 14a 193 56 173 s2 <NA> 8 1 0 NA
6 2014-08-18 <NA> <NA> <NA> <NA> <NA> <NA> NA NA 0 1
I have labeled consecutive days in the missing data time periods in column GAP_Days
.
I would like to create a new column in the dataframe that labels the missing data time period. For example, this dataset has 9 missing time periods. The new column GAP_LABEL would simply start at 1 and end at 9. For example, in the first missing time period in this data set, the GAP_LABEL value would be 1 for the first three dates, since they all belong to the same missing data time period.
Anticipated outcome
Date MEID.1 MEID.2 MEID.3 MEID.4 MEID.5 MEID.6 DAYT DATT Reason.For.Change GAP_Days GAP_LABEL
1 2014-08-13 <NA> <NA> <NA> <NA> <NA> <NA> NA NA 0 1 1
2 2014-08-14 <NA> <NA> <NA> <NA> <NA> <NA> NA NA 0 2 1
3 2014-08-15 <NA> <NA> <NA> <NA> <NA> <NA> NA NA 0 3 1
4 2014-08-16 14 61 117 187 s1 <NA> 1 1 0 NA
5 2014-08-17 14a 193 56 173 s2 <NA> 8 1 0 NA
6 2014-08-18 <NA> <NA> <NA> <NA> <NA> <NA> NA NA 0 1 2
I have not seen other examples that demonstrate how this would be done. If anyone has suggestions, that would be great.