I have a data frame that contains two different IDs with overlapping event_time values. I need to aggregate it into 1-hour intervals for each ID and take the mean of the remaining columns.
id event_time 1 2 3 4 33 34 38 39 41 42
1 1001 2017-05-22 16:56:07 NA NA NA NA NA NA NA 1215.35 NA NA
2 1001 2017-05-22 16:57:07 NA NA NA NA NA NA 53.5 1243.36 0.24 0.20
3 1001 2017-05-22 16:58:07 NA NA NA NA NA NA 53.8 1234.08 0.71 0.88
4 1001 2017-05-22 16:59:07 NA NA NA NA NA NA 53.2 1236.73 0.55 0.42
5 1001 2017-05-22 17:00:08 NA NA NA NA NA NA 53.8 1257.87 0.43 0.36
6 1001 2017-05-22 17:01:08 NA NA NA NA NA NA 52.8 1222.55 0.78 0.42
....
id event_time 1 2 3 4 33 34 38 39 41 42
95 1002 2017-05-22 16:56:50 NA NA NA NA NA NA NA 1220.35 NA NA
96 1002 2017-05-22 16:57:07 NA NA NA NA NA NA 53.5 1233.36 0.24 0.20
97 1002 2017-05-22 16:58:17 NA NA NA NA 44 NA 53.8 1256.08 0.71 0.88
98 1002 2017-05-22 16:59:33 NA 11 NA NA NA NA 53.2 1277.73 0.55 0.42
99 1002 2017-05-22 17:00:21 NA 11 NA NA 56 NA 53.8 1288.87 0.43 0.36
100 1002 2017-05-22 17:01:10 NA 19 NA NA NA NA 52.8 1201.55 0.78 0.42
I used the dplyr package to group_by the IDs and then called aggregate, but it throws an error:
data_1hour <- data %>% group_by(id) %>%
  aggregate(list(Tag_1 = data$`1`, Tag_2 = data$`2`,
                 Tag_3 = data$`3`, Tag_4 = data$`4`,
                 Tag_33 = data$`33`, Tag_34 = data$`34`,
                 Tag_38 = data$`38`,
                 Tag_39 = data$`39`, Tag_40 = data$`41`,
                 Tag_42 = data$`42`),
            list(timestamps = cut(data$event_time, "1 hour")), mean, na.rm = "TRUE")
Error in match.fun(FUN) : 'list(timestamps = cut(data$event_time, "1 hour"))' is not a function, character or symbol
There are many NA values that I would like to ignore when computing the means, which is why I passed na.rm = "TRUE". How do I proceed with this?
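
One possible direction, offered as a sketch rather than a definitive fix: the error appears to come from the pipe passing the grouped data frame as the first argument of aggregate(), so the list of tags is taken as `by` and the cut() list ends up in the FUN slot. Staying inside dplyr avoids that. The example below assumes event_time is already POSIXct, assumes dplyr 1.0.0 or later (for across()), and uses a small made-up `data` frame that copies a few rows and only columns `38` and `39` from the sample above.

```r
library(dplyr)

# Toy stand-in for the real data (an assumption): a few rows taken from the
# sample in the question, keeping only columns `38` and `39`.
data <- data.frame(
  id = c(1001, 1001, 1001, 1002, 1002, 1002),
  event_time = as.POSIXct(
    c("2017-05-22 16:56:07", "2017-05-22 16:57:07", "2017-05-22 17:00:08",
      "2017-05-22 16:56:50", "2017-05-22 16:57:07", "2017-05-22 17:00:21"),
    tz = "UTC"
  ),
  `38` = c(NA, 53.5, 53.8, NA, 53.5, 53.8),
  `39` = c(1215.35, 1243.36, 1257.87, 1220.35, 1233.36, 1288.87),
  check.names = FALSE  # keep the numeric column names as they are
)

data_1hour <- data %>%
  # Label every row with the start of its 1-hour bin.
  mutate(timestamps = cut(event_time, "1 hour")) %>%
  # Group by both the ID and the hourly bin.
  group_by(id, timestamps) %>%
  # Average every remaining column, skipping NA values.
  summarise(across(-event_time, ~ mean(.x, na.rm = TRUE)), .groups = "drop")

print(data_1hour)
```

Here na.rm is given as the logical TRUE rather than the string "TRUE". A column that is entirely NA within a bin yields NaN, since there is nothing left to average.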