Combining variables with multiple data.frame attributes

Question

I have multiple variables with many data.frame attributes that look like this:

And I would like to combine them into a variable with data.frames like this:

I have tried rbind(df1, df2) but that creates a variable that looks like:

> rbind(df1, df2)[,"ML1"]
$df1
  Q2 Q3
1  1  1
2  0  0
3  0  1
4  1  1
5  0  0

$df2
   Q4 Q5
1   1  1
2   1  0
3   1  0
4   0  0
5   0  0
6   1  0
7   1  1
8   0  1
9   1  1
10  0  1

So the rows are not appended to each other in the same data.frame.

What else do I need to do?

Do you mean `rbind`? "Error in rbind(deparse.level, ...) : numbers of columns of arguments do not match" or with `cbind`: "Error in data.frame(..., check.names = FALSE) : arguments imply differing number of rows: 55, 256, 121, 64, 40, 27, 93, 21, 22, 50, 53, 71, 51, 56, 29, 74, 16, 10, 24" — bountiful, Mar 03 '13 at 15:12
I mean `cbind` since you show a list of data.frames with 256 rows each. If you make your question [reproducible](http://stackoverflow.com/a/5963610/1412059), we can give better advice. — Roland, Mar 03 '13 at 15:17
I said underneath that the number of observations in each df differs. — bountiful, Mar 03 '13 at 15:17
@fophillips Then how should they be merged? To reiterate Roland's point, you need to provide a reproducible example that allows others to re-create the problem :) Please review his link. — Anthony Damico, Mar 03 '13 at 15:19
Well, how should that work then? For `cbind` you need the same number of rows, for `rbind` the same number of columns. Maybe you want to `merge`? Find out what you actually want to achieve and then we can help with that. — Roland, Mar 03 '13 at 15:19

score 2 · Accepted Answer · answered Mar 03 '13 at 17:12

This assumes, as in your example, that all of your lists of data frames (df1, df2, etc) are consistent, specifically that if df1 has element "foo", then df2 also has element "foo", and that df1$foo and df2$foo are rbind-compatible data frames.

# create two sample lists of data.frames
df1 <- lapply(list(ML1=1:2, ML2=3:4), function(i) head(iris[i]))
df2 <- lapply(list(ML2=3:4, ML1=1:2), function(i) tail(iris[i]))
# store them in a list for easier reference
dfList <- list(df1, df2)
# for each data frame name, extract and rbind the corresponding data 
dfAll <- sapply(names(dfList[[1]]), function(col) do.call("rbind", 
    lapply(dfList, "[[", col)), simplify=FALSE)

This is similar in spirit to other answers, but produces output structured as posed in the question:

> dfAll
$ML1
    Sepal.Length Sepal.Width
1            5.1         3.5
2            4.9         3.0
3            4.7         3.2
4            4.6         3.1
5            5.0         3.6
6            5.4         3.9
145          6.7         3.3
146          6.7         3.0
147          6.3         2.5
148          6.5         3.0
149          6.2         3.4
150          5.9         3.0

$ML2
    Petal.Length Petal.Width
1            1.4         0.2
2            1.4         0.2
3            1.3         0.2
4            1.5         0.2
5            1.4         0.2
6            1.7         0.4
145          5.7         2.5
146          5.2         2.3
147          5.0         1.9
148          5.2         2.0
149          5.4         2.3
150          5.1         1.8
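
For readers unfamiliar with passing "[[" as a string: `lapply` looks up its function argument with `match.fun`, so the string resolves to the ordinary extraction operator (see `?Extract`). The sketch below reuses the sample data from this answer to show the equivalent anonymous-function form and an explicit-loop version; the names `viaString`, `viaClosure` and `dfAllLoop` are made up for illustration.

```r
# Same sample data as in the answer above
df1 <- lapply(list(ML1 = 1:2, ML2 = 3:4), function(i) head(iris[i]))
df2 <- lapply(list(ML2 = 3:4, ML1 = 1:2), function(i) tail(iris[i]))
dfList <- list(df1, df2)

# The string "[[" is resolved to the extraction function, so both calls
# pull the element named "ML1" out of each list in dfList
viaString  <- lapply(dfList, "[[", "ML1")
viaClosure <- lapply(dfList, function(x) x[["ML1"]])
identical(viaString, viaClosure)

# Explicit-loop version of the whole operation (illustrative only)
dfAllLoop <- list()
for (nm in names(dfList[[1]])) {
  dfAllLoop[[nm]] <- do.call("rbind", lapply(dfList, function(x) x[[nm]]))
}
str(dfAllLoop, max.level = 1)
```

Because elements are extracted by name, it does not matter that df2 stores ML2 before ML1.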

As an aside, assuming this is part of a larger workflow, if at all possible I'd modify the upstream code so that the data frames are assembled in dfList from the start, rather than as individual named objects that are then combined.
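
A minimal sketch of what that could look like, assuming the upstream step can be wrapped in a function. `makeBatch` and `rowSets` are hypothetical stand-ins, not part of the original question; the point is only that the results land in a list straight away.

```r
# Hypothetical stand-in for whatever upstream step produces one list of data frames
makeBatch <- function(rows) {
  list(ML1 = iris[rows, 1:2], ML2 = iris[rows, 3:4])
}

# Hypothetical inputs: one element per batch
rowSets <- list(1:6, 145:150)

# Collect the results directly in a list instead of creating df1, df2, ... by hand
dfList <- lapply(rowSets, makeBatch)

# Combine per name exactly as above
dfAll <- sapply(names(dfList[[1]]),
                function(nm) do.call("rbind", lapply(dfList, "[[", nm)),
                simplify = FALSE)
str(dfAll, max.level = 1)
```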

+1 for vectorizing the whole thing. In what help file can I read about using "[[" in the way you did? I hadn't noticed my final output structure deviated from what was requested. Oops. — russellpierce, Mar 04 '13 at 08:48
There is nothing vectorized here. `*apply` functions are loops. — Roland, Mar 04 '13 at 10:26

russellpierce · Answer 2 · 2013-03-04T08:52:06.227

The first thing to notice is that (it looks like) you don't have two data.frames. What you have is two lists, each of which contains two data.frames: one named ML1 and the other named ML2.

#Example data for the purpose of being reproducible
df1 <- list(ML1=data.frame(Q2=1:5,Q3=LETTERS[1:5]),ML2=data.frame(Q4=6:10,Q5=LETTERS[6:10]))
df2 <- list(ML1=data.frame(Q2=11:15,Q3=LETTERS[11:15]),ML2=data.frame(Q4=16:24,Q5=LETTERS[16:24]))
# Let's look at the structure, just for educational purposes
str(df1)
str(df2)
# Okay, now how can we bind those lists together?  
# It turns out, we just use c because lists are really just vectors of type list.
list.all <- c(df1,df2)
# Now that all of the data is in one structure, we have hope.
# The rbindListsByName function does the heavy lifting.  
# Notice we couldn't just index by nameToBind, because that would grab only the first element with a matching name.
# We need a logical vector to pick out every element we actually want.
rbindListsByName <- function(nameToBind,inputList) {
   do.call("rbind",inputList[names(inputList)==nameToBind])
}
dfAll <- sapply(names(df1),rbindListsByName,inputList=list.all,simplify=FALSE)
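
To see why the logical index matters, here is a small sketch with made-up lists `a` and `b` (not the question's data). After `c()`, the names are repeated, and name-based `[[` returns only the first match.

```r
# Small hypothetical lists with the same element names
a <- list(ML1 = data.frame(x = 1:2), ML2 = data.frame(y = 1:2))
b <- list(ML1 = data.frame(x = 3:5), ML2 = data.frame(y = 3:5))
both <- c(a, b)

names(both)                 # names are repeated after c()
nrow(both[["ML1"]])         # name-based [[ returns only the first match
sum(names(both) == "ML1")   # the logical comparison flags every match

# rbind every element named "ML1"
ml1 <- do.call("rbind", both[names(both) == "ML1"])
nrow(ml1)
```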

Fixed my answer so that it provides the requested output structure. — russellpierce, Mar 04 '13 at 08:52

score -1 · Answer 3 · answered Mar 03 '13 at 16:21

I'll give it a try. However, your question is still not reproducible.

#Here I try to recreate your objects:
ML1 <- read.table(text="Q2 Q3
1  1  1
2  0  0
3  0  1
4  1  1
5  0  0",header=TRUE)

ML2 <- read.table(text="Q4 Q5
1  0  1
2  1  1
3  1  0
4  1  0
5  0  0",header=TRUE)

df1 <- list(ML1=ML1,ML2=ML2)

ML2 <- read.table(text="Q4 Q5
1   1  1
2   1  0
3   1  0
4   0  0
5   0  0
6   1  0
7   1  1
8   0  1
9   1  1
10  0  1",header=TRUE)

ML1 <- read.table(text="Q2 Q3
1   0  0
2   1  1
3   0  1
4   1  0
5   0  1
6   1  0
7   0  1
8   1  0
9   1  1
10  0  0",header=TRUE)

df2 <- list(ML2=ML2,ML1=ML1)

I don't see a reason to keep ML1 and ML2 separate. Thus, I cbind them.

#combine the lists in a list
mylist <- list(df1,df2)

#cbind the data.frames in each sublist
mylist <- lapply(mylist,function(x) do.call("cbind",x))
#rbind the resulting data.frames
do.call("rbind",mylist)

#   ML1.Q2 ML1.Q3 ML2.Q4 ML2.Q5
#1       1      1      0      1
#2       0      0      1      1
#3       0      1      1      0
#</snip>
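
Note that the `cbind` step only works when the data.frames inside each sublist have the same number of rows. A rough sketch of a pre-check, using made-up lists `ok` and `bad` and a hypothetical helper `sameRows`:

```r
# Hypothetical lists: in `ok` both data frames have 3 rows, in `bad` they differ
ok  <- list(ML1 = data.frame(Q2 = 1:3, Q3 = 3:1), ML2 = data.frame(Q4 = 4:6, Q5 = 6:4))
bad <- list(ML1 = data.frame(Q2 = 1:3, Q3 = 3:1), ML2 = data.frame(Q4 = 1:5, Q5 = 5:1))

# TRUE when every data frame in the list has the same number of rows
sameRows <- function(x) length(unique(vapply(x, nrow, integer(1)))) == 1

sameRows(ok)
sameRows(bad)

# cbind succeeds for `ok`; for `bad` it stops with a "differing number of rows" error
combined <- do.call("cbind", ok)
failed <- inherits(try(do.call("cbind", bad), silent = TRUE), "try-error")
```

If the check fails for your data, this approach does not apply and the per-name `rbind` from the other answers is the safer route.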
