I am trying to figure out how to build a subset of my dataset based on unique IDs (a vector of IDs in which each ID appears only once) whose time differences are at least two standard deviations above or below the average time difference. The subset must contain every row belonging to those IDs, with all columns, and it must be ordered by ID.
This is the code I have attempted already:
data <- myDataset[unique(myDataset$ID) & myDataset$timeDiff >= mean_time_diff_rounded + 2 * sd_time_diff_rounded |
myDataset$timeDiff <= mean_time_diff_rounded - 2 * sd_time_diff_rounded,]
Usually, the unique() function gets rid of repeated values, but here it does not seem to have that effect: the result still contains repeated IDs. I am not sure where to go from here. Additionally, to order the final subset by ID I assume I need the order() function, but I am not sure how to use it correctly in this context.
Any help would be appreciated!
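
One possible approach, offered as a minimal sketch rather than a definitive answer. In the attempt above, unique(myDataset$ID) returns a vector of ID values, not a TRUE/FALSE value per row, so combining it with & does not filter out duplicates (numeric IDs are simply coerced to logical and recycled). In addition, & binds more tightly than |, so the two threshold conditions need their own parentheses. The sketch below works in three steps: flag the outlying rows, collect the unique IDs of those rows, then keep every row whose ID is in that set and sort with order(). The toy data frame is made up for illustration, the helper names (upper, lower, is_outlier, outlier_ids) are hypothetical, and the cutoffs are computed directly from the data instead of from the pre-rounded mean_time_diff_rounded and sd_time_diff_rounded values used in the question.

```r
# Hypothetical toy data standing in for myDataset (values invented for illustration)
myDataset <- data.frame(
  ID       = c(5, 2, 1, 5, 3, 2, 4, 1, 3, 4, 6, 6),
  timeDiff = c(140, 100, 100, 100, 100, 60, 100, 100, 100, 100, 100, 100)
)

# Cutoffs at two standard deviations from the mean
# (assumption: computed here; the question uses pre-rounded values instead)
mean_time_diff <- mean(myDataset$timeDiff, na.rm = TRUE)
sd_time_diff   <- sd(myDataset$timeDiff, na.rm = TRUE)
upper <- mean_time_diff + 2 * sd_time_diff
lower <- mean_time_diff - 2 * sd_time_diff

# Step 1: one TRUE/FALSE per row; rows with a missing timeDiff count as FALSE
is_outlier <- !is.na(myDataset$timeDiff) &
  (myDataset$timeDiff >= upper | myDataset$timeDiff <= lower)

# Step 2: each qualifying ID exactly once
outlier_ids <- sort(unique(myDataset$ID[is_outlier]))

# Step 3: keep all rows (and all columns) for those IDs, then order by ID
data <- myDataset[myDataset$ID %in% outlier_ids, ]
data <- data[order(data$ID), ]
```

With this toy data, outlier_ids holds IDs 2 and 5, and data holds all four rows for those two IDs, including the rows whose own timeDiff is not extreme. If only the extreme rows themselves are wanted, indexing with is_outlier directly in step 3 would do that instead.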