I have a dataframe, df, and a function process that returns a list of two dataframes, a and b. I use dlply to split up the df on an id column, and then return a list of lists of dataframes. Here's sample data/code that approximates the actual data and methods:
df <- data.frame(id1=rep(c(1,2,3,4), each=2))
process <- function(df) {
  a <- data.frame(d1=rnorm(1), d2=rnorm(1))
  b <- data.frame(id1=df$id1, a=rnorm(nrow(df)), b=runif(nrow(df)))
  list(a=a, b=b)
}
require(plyr)
output <- dlply(df, .(id1), process)
output is a list of lists of dataframes; each nested list always has two dataframes, named a and b. In this case the outer list has length 4.
What I am looking to generate is a dataframe with all the a dataframes, along with an id column indicating their respective value (I believe this is left in the list as the split_labels attribute, see str(output)). Then similarly for the b dataframes.
So far I have in part used this question to come up with this code:
list <- unlist(output, recursive = FALSE)
list.a <- lapply(1:4, function(x) {
  list[[(2*x)-1]]
})
all.a <- rbind.fill(list.a)
This gives me the final a dataframe (and likewise for b, with a different subscript into list). However, it doesn't have the id column I need, and I'm pretty sure there's a more straightforward or elegant solution, ideally something clean using plyr.
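For reference, one direction that might work is sketched below. It assumes that ldply, applied to the list returned by dlply, labels each row with the piece it came from (via the split_labels attribute or the list names); I have not confirmed that this holds in every plyr version. The names all.a and all.b are just placeholders for the two combined dataframes.

```r
library(plyr)

# Sample data and function, as in the question
df <- data.frame(id1 = rep(c(1, 2, 3, 4), each = 2))
process <- function(df) {
  a <- data.frame(d1 = rnorm(1), d2 = rnorm(1))
  b <- data.frame(id1 = df$id1, a = rnorm(nrow(df)), b = runif(nrow(df)))
  list(a = a, b = b)
}
output <- dlply(df, .(id1), process)

# Pull the "a" element out of each nested list and stack the results.
# Assumption: ldply prepends a label column identifying the source piece.
all.a <- ldply(output, function(x) x$a)

# Same idea for the "b" element (which already carries its own id1 column)
all.b <- ldply(output, function(x) x$b)
```

If this behaves as assumed, all.a should have one row per id and all.b one row per original row of df, without indexing into an unlisted copy by position.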