I was very impressed by how smoothly the dplyr package handles pipeline-style data processing. Recently I ran into a problem: I need to generate a new data frame for each group ID and then combine those small data frames into one larger final data frame. A toy example:
input.data.frame %>%
  group_by(gid) %>%
  {some operation to generate a new data frame for each group} ## FAILED!!!!
In dplyr, the function mutate adds new columns within each group and summarise generates a summary for each group, but neither fulfills my requirement. (Did I miss something?)
Alternatively, using ddply from the plyr package, the previous iteration of dplyr, I can do it via
ddply(input.data.frame, .(gid), function(x) {
  # some operation to generate a new data frame for each group
})
But the drawback is that some dplyr functions get masked when I load the plyr package.
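To make the toy example concrete, here is a minimal sketch of the ddply approach with made-up data. The `value` column and the `per_group` helper are hypothetical and only stand in for "some operation". The sketch calls `plyr::ddply` through its namespace rather than via `library(plyr)`, which, as far as I understand, leaves plyr unattached so that dplyr's functions are not masked.

```r
# Hypothetical toy data; `gid` is the grouping column from the example above.
input.data.frame <- data.frame(
  gid   = c("a", "a", "b", "b", "b"),
  value = c(1, 2, 3, 4, 5)
)

# Hypothetical per-group operation: returns a new two-row data frame
# (the minimum and maximum of `value`) for whatever group it receives.
per_group <- function(x) {
  data.frame(
    stat  = c("min", "max"),
    value = c(min(x$value), max(x$value))
  )
}

# Call ddply through the plyr namespace without attaching the package.
# The grouping variable is passed as a character vector, so plyr's `.()`
# helper is not needed either.
result <- plyr::ddply(input.data.frame, "gid", per_group)
```

`result` stacks the small per-group data frames into one data frame, with `gid` added back as the first column.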