-1

although there are many results out there none seems to answer my question below: I have two data frames

df1 = data.frame(x= c("a1","b1","c1","d1","e1","f1"),y =  c("a2","b2","c2","d2","e2","f2"), z = 1:6)

df2 = data.frame(x = c("a1", "b1"), y = c("a2", "b2"))

they return:

> df1
   x  y z
1 a1 a2 1
2 b1 b2 2
3 c1 c2 3
4 d1 d2 4
5 e1 e2 5
6 f1 f2 6


> df2
   x  y
1 a1 a2
2 b1 b2

How can I get the remaining of their inner join, I mean I want to get:

   x  y z

1 c1 c2 3
2 d1 d2 4
3 e1 e2 5
4 f1 f2 6

I tried

subset(df1, (df1$x %in% df2$x) & (df1$y %in% df2$y) )

but it doesn' t work also tried left join, but I keep getting the original data frame df1

Fierce82
  • 408
  • 5
  • 17

1 Answers1

0

Here is a solution using a key made up by combining the first and second columns together and then using the %in% function.

df1 = data.frame(x= c("a1","b1","c1","d1","e1","f1"),y =  c("a2","b2","c2","d2","e2","f2"), z = 1:6)

df2 = data.frame(x = c("a1", "b1"), y = c("a2", "b2"))

#create a unique key for each dataframe
df1key<-paste(df1$x, df1$y)
df2key<-paste(df2$x, df2$y)

#return rows not matching rows in df2
answer<-df1[!(df1key  %in%  df2key),]
Dave2e
  • 22,192
  • 18
  • 42
  • 50