While merging three data.frames with the plyr library, I encounter entries that have the same name but a different value in each data.frame.

How does do.call(rbind.fill, list) handle this: does it combine them by arithmetic or geometric average?

Marek
Yigael Satanower

1 Answer

From the help page for rbind.fill:

Combine data.frames by row, filling in missing columns. rbinds a list of data frames 
filling missing columns with NA.

So I'd expect it to fill columns that do not match with NA; it does not average anything. It is also not necessary to use do.call() here.

library(plyr)

dat1 <- data.frame(a = 1:2, b = 4:5)
dat2 <- data.frame(b = 3:2, c = 8:9)
dat3 <- data.frame(a = 5:6, c = 1:2)

rbind.fill(dat1, dat2, dat3)
   a  b  c
1  1  4 NA
2  2  5 NA
3 NA  3  8
4 NA  2  9
5  5 NA  1
6  6 NA  2
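As a small sketch of why do.call() is optional: according to the plyr documentation, the first argument of rbind.fill may itself be a list of data frames. The list name `dfs` and the result names below are made up for illustration; the three calls should give the same result.

```r
library(plyr)

dat1 <- data.frame(a = 1:2, b = 4:5)
dat2 <- data.frame(b = 3:2, c = 8:9)
dat3 <- data.frame(a = 5:6, c = 1:2)

# `dfs` is an illustrative name for the list of data frames
dfs <- list(dat1, dat2, dat3)

by_args   <- rbind.fill(dat1, dat2, dat3)  # data frames passed one by one
by_list   <- rbind.fill(dfs)               # list passed directly
by_docall <- do.call(rbind.fill, dfs)      # the do.call() form from the question

# Compare the three results
all.equal(by_args, by_list)
all.equal(by_list, by_docall)
```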

Are you expecting something different?

Chase
  • Hi again. I am working with proteins. So in column 32 of data.frame1, say, I have a protein named QPUYT and – Yigael Satanower May 29 '11 at 04:16
  • How do I merge entries with the same name but different values using a geometric average? That is, I want the output to have one representative per name whose value is the geometric average, rather than two separate entries as before. In other words, I want to average the values and keep the names. Thanks – Yigael Satanower May 29 '11 at 04:22
  • @Yigael - Please update your question following the guidelines outlined in this post: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example. You'll get far superior answers compared to me taking a WAG as to what you really want. If I understand your last comment, does `colMeans(dat, na.rm = TRUE)`, where `dat` is the `rbind`ed data.frames, do what you want? I should also mention that the http://www.bioconductor.org/ project may have tools specifically designed to assist in this type of analysis. – Chase May 29 '11 at 12:59
  • @yigael Merging is what `ddply` does well: `ddply(my_data_frame, .(namecolumns), summarize, value = geomean(value))` – Alex Brown Aug 04 '11 at 06:14
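A minimal sketch of the approach in the last comment, under some assumptions: the data frames, the column names `protein` and `value`, and all numbers below are invented for illustration, and since base R has no `geomean` function, the geometric mean is written inline as `exp(mean(log(value)))`, which is only valid for positive values.

```r
library(plyr)

# Made-up data: the name QPUYT occurs in two data frames with different values
df1 <- data.frame(protein = c("QPUYT", "ABCDE"), value = c(2, 10))
df2 <- data.frame(protein = c("QPUYT", "FGHIJ"), value = c(8, 5))
df3 <- data.frame(protein = "ABCDE", value = 40)

# rbind.fill only stacks the rows; duplicated names are kept, nothing is averaged
stacked <- rbind.fill(df1, df2, df3)

# Collapse to one row per name, using the geometric mean of its values
# (assumes all values are positive)
result <- ddply(stacked, .(protein), summarize,
                value = exp(mean(log(value))))
result
```

Here QPUYT ends up as a single row whose value is the geometric mean of 2 and 8, i.e. 4, up to floating-point rounding.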