
I have a list of data frames, and I want to create a new data frame containing the rows of all of them.

I do not need to check for duplicates, since no data is repeated, but I cannot find a function that appends the rows of one data frame to another.

I tried to use the merge function as follows, with no success:

folds is a list() where each element is a data frame, all with the same structure but different data.

  # Copy the structure (zero rows) into a new object
  trainingSub <- folds[[1]][0, ]
  # Append the data from every fold except fold i
  for (y in 1:K) {
    if (y != i) {
      trainingSub <- merge(trainingSub, folds[[y]], all = TRUE)
    }
  }
A5C1D2H2I1M1N2O1R2T1
f.leno

4 Answers


By the sounds of it, you are looking for the classic:

do.call(rbind, folds)

which will append a list of data.frames together by row.

If you need to combine by column instead, the approach would be:

do.call(cbind, folds)
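
Applied to the question's loop, which skips one fold, the same idea might look like the sketch below. The three small data frames and the value of `i` are made up for illustration; only `folds`, `i` and `trainingSub` are names taken from the question.

```r
# Hypothetical stand-in for the question's `folds`: three data frames
# that share one structure but hold different rows.
folds <- list(
  data.frame(id = 1:2, value = c(0.1, 0.2)),
  data.frame(id = 3:4, value = c(0.3, 0.4)),
  data.frame(id = 5:6, value = c(0.5, 0.6))
)

i <- 2  # assumed index of the fold to leave out

# Negative indexing drops fold i; rbind then stacks the remaining folds by row.
trainingSub <- do.call(rbind, folds[-i])
nrow(trainingSub)

# cbind places the data frames side by side instead,
# so they must all have the same number of rows.
wide <- do.call(cbind, folds)
dim(wide)
```

Indexing with `folds[-i]` replaces both the loop and the `if (y != i)` test, so no empty starting data frame is needed.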
A5C1D2H2I1M1N2O1R2T1

How about this?

do.call(rbind,folds)
xb.

You could try using rbindlist:

library(data.table)
xmpl <- list(data.frame(a = 1:3),
             data.frame(a = 4:6),
             data.frame(a = 7:9))

rbindlist(xmpl)
#    a
# 1: 1
# 2: 2
# 3: 3
# 4: 4
# 5: 5
# 6: 6
# 7: 7
# 8: 8
# 9: 9

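rbindlist is faster than the do.call approach but less flexible: it requires a list of data.frame or data.table objects, so a list of plain vectors still needs do.call(rbind, ...). It also returns a data.table rather than a data.frame, and there is no rbindlist equivalent for quickly doing a cbind.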

dayne
  • Note: that function requires the `data.table` package. – Frank Apr 29 '15 at 19:44
  • dayne, in what way do you mean less flexible? – Arun Apr 30 '15 at 06:40
  • dayne, so does `do.call`.. no? – Arun Apr 30 '15 at 12:08
  • @Arun Sorry. I was typing that on my phone this morning, and did not finish my thought. Then the page reloaded and I did not think it posted. – dayne Apr 30 '15 at 15:07
  • @Arun Really, the functionality that `rbindlist` will not handle is a list of vectors. `rbindlist` requires a list of data.table or data.frame objects, so I have ended up having to use the `do.call(rbind, ...)` method when working with a list of vectors. There is also no equivalent for `cbind`. It was not meant to be a knock on `rbindlist` -- it's my preferred method. – dayne Apr 30 '15 at 15:09
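
The list-of-vectors case mentioned in the last comment can be sketched in base R as follows. The vectors and the names `vecs`, `m` and `df` are invented for the example.

```r
# Hypothetical list of named numeric vectors, one per future row.
vecs <- list(
  c(a = 1, b = 2),
  c(a = 3, b = 4),
  c(a = 5, b = 6)
)

# rbind on atomic vectors yields a matrix, not a data frame.
m <- do.call(rbind, vecs)
is.matrix(m)

# Convert afterwards if a data frame is needed downstream.
df <- as.data.frame(m)
names(df)
```

Because the result is a matrix, all values end up with one common type, so this only suits vectors whose elements share a type.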

dplyr and plyr alternatives to the other great approaches listed here:

# Using dplyr (bind_rows() replaces rbind_all(), which was deprecated and later removed)
library(dplyr)
data.frame(bind_rows(folds))

# Using plyr
library(plyr)
data.frame(rbind.fill(folds))

Both do the same job as do.call(rbind, folds) but run faster.

Benchmarks:

# Build 2000 data frames of 10 x 10 random numbers
folds <- lapply(1:2000, function(i) {
    data.frame(matrix(runif(100), 10, 10))
})

system.time({ x1 <- do.call(rbind, folds) })
#   user  system elapsed
#   1.11    0.00    1.10

# Timed with rbind_all(), which current dplyr no longer provides; use dplyr::bind_rows() to rerun
system.time({ x2 <- data.frame(dplyr::rbind_all(folds)) })
#   user  system elapsed
#   0.05    0.00    0.05

system.time({ x3 <- data.frame(plyr::rbind.fill(folds)) })
#   user  system elapsed
#   0.53    0.00    0.54

system.time({ x4 <- data.frame(data.table::rbindlist(folds)) })
#   user  system elapsed 
#   0.02    0.00    0.02

Proof that they all yield the same result:

identical(x1, x2)
# TRUE
identical(x1, x3)
# TRUE
identical(x1, x4)
# TRUE
Alex A.