5

I am looking for an efficient way (fast to run and simple to code) to do what rbind.fill does, but in base R. My searching turned up plenty of package functions, such as smartbind, bind_rows, and rbind on data.table, but, as stated above, I need a solution in base R. I found this:

df3 <- rbind(df1, df2[, names(df1)])

in an answer to this question, but it drops the extra columns, whereas I want them kept and filled with NA.
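To make the limitation concrete, here is a minimal sketch. The data frames df1 and df2 below are made up for illustration and are not the ones from the linked answer. When df2 has an extra column, subsetting silently drops it; when df1 is the one with the extra column, the subsetting step fails because the column does not exist.

```r
# Made-up data for illustration only
df1 <- data.frame(aaa = c(1, 2), bbb = c(3, 4))
df2 <- data.frame(aaa = c(5, 6), bbb = c(7, 8), ddd = c(9, 9))

# Subsetting df2 to df1's columns discards ddd before binding
df3 <- rbind(df1, df2[, names(df1)])
names(df3)

# In the other direction df1 has no ddd column, so the subsetting raises an error
res <- try(rbind(df2, df1[, names(df2)]), silent = TRUE)
inherits(res, "try-error")
```

So the one-liner only works when the second data frame contains every column of the first, and even then it loses the extras.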

It would also be nice if this method worked on an empty data.frame and a populated one, essentially just returning the populated one. (This is for the sake of simplicity; if it's not possible, it's not hard to just replace the variable with the new data.frame when it's empty.)

I would also like it to bind by column names for the columns which are labeled the same. Additionally, the first data frame can be both bigger and smaller than the second one, and both may have columns the other does not have.

Here is an example input and the output I would like (I made up the numbers; they don't matter).

#inputs
a <- data.frame(aaa=c(1, 1, 2), bbb=c(2, 3, 3), ccc=c(1, 3, 4))
b <- data.frame(aaa=c(8, 5, 4), bbb=c(1, 1, 4), ddd=c(9, 9, 9), eee=c(1, 2, 4))
#desired output
aaa bbb ccc ddd eee
1   2   1   NA  NA
1   3   3   NA  NA
2   3   4   NA  NA
8   1   NA  9   1
5   1   NA  9   2
4   4   NA  9   4
user438383
  • You mention several features you want. If you add a concrete example with expected output, we'll have a clearer idea of what you mean exactly. Some guidance if you're interested: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/28481250#28481250 – Frank Aug 07 '18 at 15:47
  • Further, have you tried anything (besides simple `rbind`)? It might be easy-enough to do, but please do not consider SO a free-code-service. (Somebody might do it anyway, but it goes a long way to show effort on your end.) – r2evans Aug 07 '18 at 15:48
  • I've updated my answer to (hopefully) address some of your comments! –  Aug 07 '18 at 16:04
  • Are you just asking for a reimplementation of `rbind.fill` in base R? If so, why reinvent this? What does it do/not do already that you need? – Aaron left Stack Overflow Aug 07 '18 at 16:11
  • @Aaron I'm writing this code for a server which has very specific restrictions on what can be on there. It's not really that I need any _new_ features. I guess I am just asking for the implementation of rbind.fill, but I'm just surprised (probably because I haven't been using R enough) that there isn't really a base "R-style" way of doing this without defining new functions. –  Aug 07 '18 at 18:25

2 Answers

5

I don't know how efficient it may be, but one simple way to code this would be to add the missing columns to each data frame and then rbind together.

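rbindx <- function(..., dfs=list(...)) {
  # Union of all column names, in order of first appearance
  ns <- unique(unlist(lapply(dfs, names)))
  do.call(rbind, lapply(dfs, function(x) {
    # Add each missing column as NA so every data frame has the same columns
    for(n in setdiff(ns, names(x))) x[[n]] <- NA
    x
  }))
}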

a <- data.frame(aaa=c(1, 1, 2), bbb=c(2, 3, 3), ccc=c(1, 3, 4))
b <- data.frame(aaa=c(8, 5, 4), bbb=c(1, 1, 4), ddd=c(9, 9, 9), eee=c(1, 2, 4))
rbindx(a, b)

#   aaa bbb ccc ddd eee
# 1   1   2   1  NA  NA
# 2   1   3   3  NA  NA
# 3   2   3   4  NA  NA
# 4   8   1  NA   9   1
# 5   5   1  NA   9   2
# 6   4   4  NA   9   4
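The question also asks about binding an empty data.frame with a populated one. A possible variant of the approach above is sketched here, assuming that "empty" means a data frame with no columns, such as data.frame(). The name rbind_fill_base is hypothetical, and this is not a drop-in copy of plyr's rbind.fill. Zero-column inputs are dropped before binding, and the filler uses rep(NA, nrow(x)) so that its length always matches the row count.

```r
# Hypothetical helper name; base R only
rbind_fill_base <- function(...) {
  dfs <- list(...)
  # Drop inputs with no columns (e.g. data.frame()); they contribute nothing
  dfs <- Filter(function(x) ncol(x) > 0, dfs)
  if (length(dfs) == 0) return(data.frame())
  # Union of all column names, in order of first appearance
  ns <- unique(unlist(lapply(dfs, names)))
  filled <- lapply(dfs, function(x) {
    # Fill each missing column with one NA per row
    for (n in setdiff(ns, names(x))) x[[n]] <- rep(NA, nrow(x))
    x[ns]  # put the columns in the same order everywhere
  })
  do.call(rbind, filled)
}

a <- data.frame(aaa=c(1, 1, 2), bbb=c(2, 3, 3), ccc=c(1, 3, 4))
b <- data.frame(aaa=c(8, 5, 4), bbb=c(1, 1, 4), ddd=c(9, 9, 9), eee=c(1, 2, 4))
out <- rbind_fill_base(data.frame(), a, b)
```

With the empty data frame dropped, out should have the same six rows and five columns as the rbindx(a, b) result shown above.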
Aaron left Stack Overflow
3

Just use rbind.fill. If you can't install the plyr package, pull out the parts you need.

rbind.fill seems to have very few internal dependencies: plyr::compact is a one-liner, and plyr:::output_template depends on plyr:::allocate_column, but at a glance that all looks like base code. So copy those 4 functions, attribute the source, and make sure the license is compatible with your use. The current version on CRAN uses the MIT license, which is quite permissive; you just need to keep it MIT-licensed. Then you have the real implementation of rbind.fill.

Why take this approach? Because, as Aaron points out - you know it works. It's been used and debugged for years.

Gregor Thomas