I'm trying to merge two data.frames
in R
d1 <- data.frame(Id=1:3,Name=c("Yann","Anne","Sabri"),Age=c(21,19,31),Height=c(178,169,192),Grade=c(15,12,18))
d2 <- data.frame(Id=c(1,3,4),Name=c("Yann","Sabri","Jui"),Age=c(28,21,15),Sex=c("M","M","F"),City=c("Paris","Paris","Toulouse"))
I'd like to merge by Id
, and retaining only Id
, Name
, Age
, Sex
and Grade
columns in the final data.frame
.
I've come up with a lengthy code that does the job, but is there any better way?
dm <- data.frame(Id=unique(c(d1$Id,d2$Id)))
dm.d1.rows <- sapply(dm$Id, match, table = d1$Id)
dm.d2.rows <- sapply(dm$Id, match, table = d2$Id)
for(i in c("Name", "Age","Sex","Grade")) {
if(i %in% colnames(d1) && is.factor(d1[[i]]) || i %in% colnames(d2) && is.factor(d2[[i]])) dm[[i]]<- factor(rep(NA,nrow(dm)),
levels=unique(c(levels(d1[[i]]),levels(d2[[i]]))))
else dm[[i]]<- rep(NA,nrow(dm))
if(i %in% colnames(d1)) dm[[i]][!is.na(dm.d1.rows)] <- d1[[i]][na.exclude(dm.d1.rows)]
if(i %in% colnames(d2)) dm[[i]][!is.na(dm.d2.rows)] <- d2[[i]][na.exclude(dm.d2.rows)]
}