I am once again asking a R/dataframe-related question. Recently, I have started implementing the merge() function, but I need help with merging some of the identical columns after merging different similar data frames together.
library(dplyr)
library(tidyr)
library(stringr)
DF1 = data.frame(ID=c("A","B","C"), TagNo=c(10,20,20),Info=c("Metal","Plastic","Wood"))
DF2 = data.frame(ID=c("A","D","E"), TagNo=c(10,30,100),Info=c("Wood","Metal","Wood"))
DF3 = data.frame(ID=c("B","C","F"), TagNo=c(20,20,40),Info=c("Glass","Metal","Wood"))
dfmerge <- merge(DF1,DF2,by="ID",all.x=TRUE, all.y=TRUE)
dfmerge <- merge(dfmerge,DF3,by="ID",all.x=TRUE, all.y=TRUE)
The resulting merged dataframe looks like this:
ID | TagNo | Info.x | TagNo.y | Info.y | TagNo | Info |
---|---|---|---|---|---|---|
A | 10 | Metal | 10 | Wood | NA | NA |
B | 20 | Plastic | NA | NA | 50 | Glass |
C | 20 | Wood | NA | NA | 20 | Metal |
D | NA | NA | 30 | Metal | NA | NA |
E | NA | NA | 100 | Wood | NA | NA |
F | NA | NA | NA | NA | 40 | Wood |
I essentially want to condense the TagNo columns into just one Tag column. To highlight something important. Each sample has at least 1 TagNo, and if it has more than 1, it will always be the same. I don't ever have to worry about them being different.
This is the result I need:
ID | TagNo.x | Info.x | Info.y | Info |
---|---|---|---|---|
A | 10 | Metal | Wood | NA |
B | 20 | Plastic | NA | Glass |
C | 20 | Wood | NA | Metal |
D | 30 | NA | Metal | NA |
E | 100 | NA | Wood | NA |
F | 40 | NA | NA | Wood |
Does anyone know how to approach this problem? I appreciate any help. [Column headers aren't a big issue as I can simply fix that after with colnames(dfmerge) <- ...