
I'm trying to use Reduce in R to apply merge across multiple data frames. The problem is that I want to call merge with the argument all=T, and there seems to be nowhere to specify this in the higher-order Reduce function.

So I'd like:

a <- data.frame(id=c(1, 2, 3, 4), a=c('a', 'b', 'c', 'd'))
b <- data.frame(id=c(1, 2, 5, 6), b=c('a', 'b', 'e', 'f'))
c <- data.frame(id=c(3, 4, 5, 6), c=c('c', 'd', 'e', 'f'))

out <- Reduce(merge, list(a, b, c), all=T)

out
  id    a    b    c
1  1    a    a <NA>
2  2    b    b <NA>
3  3    c <NA>    c
4  4    d <NA>    d
5  5 <NA>    e    e
6  6 <NA>    f    f

But because merge defaults to all=F, what I'm getting is:

[1] id a  b  c 
<0 rows> (or 0-length row.names)
Amadou Kone

1 Answer


As far as I know, Reduce cannot pass extra arguments through to the function it applies. But you can wrap merge with the arguments you want in an anonymous function and pass that to Reduce:

Reduce(function(x, y) merge(x, y, by = "id", all = TRUE), list(a, b, c))

#  id    a    b    c
#1  1    a    a <NA>
#2  2    b    b <NA>
#3  3    c <NA>    c
#4  4    d <NA>    d
#5  5 <NA>    e    e
#6  6 <NA>    f    f
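
If you need this in more than one place, the same idea can be wrapped in a small helper. This is only a sketch: merge_all is a hypothetical name, not a base R function, and it assumes every data frame in the list shares the key column you pass in by.

```r
# Hypothetical helper (not part of base R): merge a list of data frames
# pairwise, forwarding any extra arguments to merge().
merge_all <- function(dfs, ...) {
  # The anonymous function picks up `...` from merge_all's call, so each
  # pairwise merge gets the same arguments (e.g. by = "id", all = TRUE).
  Reduce(function(x, y) merge(x, y, ...), dfs)
}

# Same example data as in the question.
a <- data.frame(id = c(1, 2, 3, 4), a = c("a", "b", "c", "d"))
b <- data.frame(id = c(1, 2, 5, 6), b = c("a", "b", "e", "f"))
c <- data.frame(id = c(3, 4, 5, 6), c = c("c", "d", "e", "f"))

# Full outer join on id across all three data frames.
out <- merge_all(list(a, b, c), by = "id", all = TRUE)
```

Calling merge_all(list(a, b, c), by = "id") without all = TRUE falls back to merge's default inner join, which is what produced the empty result in the question.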
Psidom