-1

Assume we have multiple data frames, say df1, df2, df3, ... What is the most efficient way in R to count the number of rows that are identical across these data frames? Nested loops are presumably not the answer, right?

Thanks

Jin
  • Next time please post a reproducible example. Thanks – RockScience Mar 19 '14 at 04:05
  • possible duplicate of [Find how many times duplicated rows repeat in R data frame](http://stackoverflow.com/questions/18201074/find-how-many-times-duplicated-rows-repeat-in-r-data-frame) – RockScience Mar 19 '14 at 04:07
  • Sorry, I thought the statement was clear enough and that no example was needed. I will add an example whenever possible. Thanks for the reminder. – Jin Mar 19 '14 at 04:36

2 Answers

1
df1=data.frame(A=11:13,B=111:113)   
df2=data.frame(A=22:24,B=222:224)   
df3=data.frame(A=c(33:35,11),B=c(333:335,111))  

If you are happy to bind the data frames manually:

> df = rbind(df1,df2,df3)

(Otherwise, you can fetch them by name and bind them in one call):

> df = do.call(what=rbind,args=mget(paste("df",1:3,sep="")))

Then

> library(plyr)  
> ddply(.data=df,.variables=colnames(df),.fun=nrow)  

Here the third column of the result is the number of times each distinct row occurs in the combined data frame.
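For readers who prefer not to depend on plyr, the same count can be sketched in base R. This is only an illustration, not part of the original answer: it reuses the example data frames above, and the helper column n is a name introduced here. Like the ddply call, it also counts a row that is repeated within a single data frame.

```r
# Example data copied from the answer above
df1 <- data.frame(A = 11:13, B = 111:113)
df2 <- data.frame(A = 22:24, B = 222:224)
df3 <- data.frame(A = c(33:35, 11), B = c(333:335, 111))

# Stack the data frames into one
df <- rbind(df1, df2, df3)

# Add a helper column of ones (n is an assumed name) and sum it
# within each distinct combination of the other columns
counts <- aggregate(n ~ ., data = cbind(df, n = 1L), FUN = sum)

# Keep only the rows that occur more than once in the stacked data
repeated <- counts[counts$n > 1, ]
print(repeated)
```

With this data, the only repeated row is A = 11, B = 111, which occurs once in df1 and once in df3.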

Jilber Urbina
RockScience
0

A bit hacky, but should work:

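# Build one key per row by pasting all columns together. A separator avoids
# collisions such as paste(1, 11, sep="") and paste(11, 1, sep="").
df1$comp <- do.call(paste, c(df1, sep="_"))
df2$comp <- do.call(paste, c(df2, sep="_"))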

and so on for the remaining data frames.

Then

# Number of complete rows in df1 that are in df2.
summary(df1$comp %in% df2$comp)
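The key-based idea can be made self-contained as below. This is a minimal sketch under stated assumptions: row_key is a hypothetical helper introduced here, the data frames are the ones from the other answer, and the separator "\r" is assumed not to occur in the data. A separator matters because an empty one lets different rows produce the same key.

```r
# Hypothetical helper: one key string per row, built from all columns.
# unname() keeps a column called "sep" from being mistaken for an argument.
row_key <- function(d) do.call(paste, c(unname(as.list(d)), sep = "\r"))

# Why a separator is needed: these two different rows collide without one
paste(1, 11, sep = "") == paste(11, 1, sep = "")   # TRUE

df1 <- data.frame(A = 11:13, B = 111:113)
df2 <- data.frame(A = 22:24, B = 222:224)
df3 <- data.frame(A = c(33:35, 11), B = c(333:335, 111))

# Rows of df1 that also appear in df3
in_df3 <- row_key(df1) %in% row_key(df3)
sum(in_df3)   # 1

# Keys present in every data frame: intersect the key sets pairwise
common <- Reduce(intersect, lapply(list(df1, df2, df3), row_key))
length(common)   # 0, since df2 shares no rows with the others
```

Reduce with intersect extends the pairwise check to any number of data frames without nested loops. Note that intersect drops duplicates, so this counts distinct shared rows.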
dvanic