I would like to subset a data table in R within each subset, based on an aggregate function over the subset of rows. For example, for each key, return all values greater than the mean of a field calculated only for rows in subset. Example:
library(data.table)
t=data.table(Group=rep(c(1:5),each=5),Detail=c(1:25))
setkey(t,'Group')
library(foreach)
library(dplyr)
ret=foreach(grp=t[,unique(Group)],.combine=bind_rows,.multicombine=T) %do%
t[Group==grp&Detail>t[Group==grp,mean(Detail)],]
# Group Detail
# 1: 1 4
# 2: 1 5
# 3: 2 9
# 4: 2 10
# 5: 3 14
# 6: 3 15
# 7: 4 19
# 8: 4 20
# 9: 5 24
#10: 5 25
The question is, is it possible to succinctly code the last two lines using data.table features? Sorry if this is a repeat, I am also struggling explaining the exact goal to have google/stackoverflow find it.