I have two databases with different numbers of columns. All columns of the second database are included in the first database. The patients in the two databases are also different. I need to combine the two databases. The function merge (or the *_join functions of dplyr) will not work in principle, since I need to stack the databases on top of each other. Row binding (rbind) should not work either, because the columns differ. What is a simple way to do it?

mydata<-data.frame(
  ID=c(1,1,1,2,2),B=rep("b",5),C=rep("c",5),D=rep("d",5)
)

mydata2<-data.frame(ID=c(3,4),B=c("b2","b2"),C=c("c2","c2"))

The expected dataset is this below:

  ID  B  C    D
1  1  b  c    d
2  1  b  c    d
3  1  b  c    d
4  2  b  c    d
5  2  b  c    d
6  3 b2 c2 <NA>
7  4 b2 c2 <NA>
Seydou GORO

3 Answers

dplyr::full_join(mydata,mydata2)

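seems to work: with no by argument, full_join() joins on all columns the two data frames share (here ID, B and C). No rows match, so every row from both is kept and D is NA for the rows from mydata2.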

A plain merge() with all = TRUE should suffice:

merge(mydata, mydata2, all = TRUE)
  ID  B  C    D
1  1  b  c    d
2  1  b  c    d
3  1  b  c    d
4  2  b  c    d
5  2  b  c    d
6  3 b2 c2 <NA>
7  4 b2 c2 <NA>
Andre Wildberg
You can use bind_rows() to combine two data frames with different numbers of columns. Columns missing from one data frame are filled with NA.

library(dplyr) 

bind_rows(mydata, mydata2)
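
If dplyr is not available, the same row stacking can be sketched in base R. This is only a minimal illustration using the data from the question; the names missing_cols, mydata2_padded and combined are made up for this example. It assumes, as the question states, that every column of mydata2 also exists in mydata.

```r
mydata <- data.frame(
  ID = c(1, 1, 1, 2, 2), B = rep("b", 5), C = rep("c", 5), D = rep("d", 5)
)
mydata2 <- data.frame(ID = c(3, 4), B = c("b2", "b2"), C = c("c2", "c2"))

# Columns present in mydata but absent from mydata2 (here: "D")
missing_cols <- setdiff(names(mydata), names(mydata2))

# Work on a copy of mydata2 and add the missing columns, filled with NA
mydata2_padded <- mydata2
mydata2_padded[missing_cols] <- NA

# rbind() on data frames matches columns by name, so column order does not matter
combined <- rbind(mydata, mydata2_padded)
combined
```

Unlike merge() or full_join(), this never tries to match rows between the two data frames; it only appends them, which is the same behaviour as bind_rows().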
Rfanatic