I have two Excel files. I want to remove rows from the first file, excel1, that are not in the second file, excel2, with an R instruction.
Here is my expected result. What should I do?
You can read the CSV files first, then take the row-wise difference:
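library(dplyr)  # base setdiff() treats a data frame as a list of columns; dplyr's compares rows
excel1 <- read.csv("excel1.csv", header = TRUE)
excel2 <- read.csv("excel2.csv", header = TRUE)
# rows of excel1 that do not appear in excel2
excel1.excel2 <- dplyr::setdiff(excel1, excel2)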
You can also refer to this post to help you prepare reproducible examples: How to make a great R reproducible example?
I finally got this code working. There may well be better approaches, but this does the trick!
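excel.1 <- data.frame(V1 = c(4,4,8,6,7), V2 = c(5,3,6,9,2))
excel.2 <- data.frame(V1 = c(7,8,4), V2 = c(2,6,3))
# every (row of excel.1, row of excel.2) pair, one pair per column of a
a <- t(expand.grid(seq_len(nrow(excel.1)), seq_len(nrow(excel.2))))
# log.vec[comb, j] is TRUE when the pair agrees in column j
log.vec <- matrix(nrow = ncol(a), ncol = ncol(excel.2))
for (comb in seq_len(ncol(a))) {
  log.vec[comb, ] <- excel.1[a[1, comb], ] == excel.2[a[2, comb], ]
}
# a pair is a full match when every column agrees
matched <- rowSums(log.vec) == ncol(excel.2)
# indices of the excel.1 rows that have a match in excel.2
equal <- unique(a[1, matched])
# drop those rows; indexing by the kept rows also handles the no-match case
new.matrix <- excel.1[setdiff(seq_len(nrow(excel.1)), equal), ]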
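If the pairwise loop gets slow on larger files, here is a shorter base R sketch of the same idea, with no extra packages. It is only a suggestion, not the approach from the answers above: row_key is a helper name I made up, and it assumes both data frames have the same columns in the same order and that no cell contains the separator character. Since the question can be read both ways, it shows both the rows with no match in excel.2 and the rows that do have one.

```r
# Same toy data as in the answer above.
excel.1 <- data.frame(V1 = c(4, 4, 8, 6, 7), V2 = c(5, 3, 6, 9, 2))
excel.2 <- data.frame(V1 = c(7, 8, 4), V2 = c(2, 6, 3))

# Hypothetical helper: collapse each row into one string key.
# Assumes identical column order in both data frames and that
# no value contains the "\r" separator.
row_key <- function(df) do.call(paste, c(unname(as.list(df)), sep = "\r"))

# TRUE for each row of excel.1 that also occurs in excel.2
in_second <- row_key(excel.1) %in% row_key(excel.2)

# Rows of excel.1 with no match in excel.2
only_in_first <- excel.1[!in_second, ]

# Rows of excel.1 that also appear in excel.2 (the other reading of the question)
in_both <- excel.1[in_second, ]
```

Comparing string keys is fine for integers and text, but for decimal numbers you may want to round first, since two values that print the same can still differ slightly.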