How would I find matches between records in two lists based on a combination of variables in R?

Question

I have two data frames...

> dim(df.x)
[1] 2120   5
> dim(df.y)
[1] 125    3

I'd like to identify the records in df.x that match a record in df.y on both variable 1 and variable 2 (the other variables don't need to match). In many languages the typical approach would be nested for loops: compare each record in x to each record in y, and record the index of each hit. Is there a more efficient way to do this in R?

(I'd prefer to stick to base R or "out-of-the-box" R, if possible, rather than some of the higher-level packages.)

It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. You can probably just `merge()` to get the overlap with base R. — MrFlick, Oct 22 '19 at 19:31

score 0 · Accepted Answer · answered Oct 22 '19 at 19:34

You can use merge() from base R, which performs an inner join by default. The code would be something like:

common <- merge(df.x, df.y, by = c("var1", "var2"))

Here var1 and var2 stand for the names of your two matching columns; both must be present in df.x and df.y.
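
Since the question didn't include sample data, here is a minimal sketch with made-up data frames (the column names var1, var2, other, extra and all values are assumptions for illustration). It shows the merge() call, plus one possible base-R way to get the row indices of df.x that have a match, in case you want the hits indexed rather than joined:

```r
# Toy data: column names and values are hypothetical, not from the question.
df.x <- data.frame(
  var1  = c("a", "b", "c", "a"),
  var2  = c(1, 2, 3, 9),
  other = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)
df.y <- data.frame(
  var1  = c("a", "c", "z"),
  var2  = c(1, 3, 5),
  extra = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Inner join: keeps only rows whose (var1, var2) pair occurs in both data frames.
# Columns not named in `by` are carried along but not used for matching.
common <- merge(df.x, df.y, by = c("var1", "var2"))

# Alternative: index the matching rows of df.x without pulling in df.y's columns.
# Build a composite key per row and test membership.
# Assumes the separator "|" never occurs in the key values themselves.
key.x <- paste(df.x$var1, df.x$var2, sep = "|")
key.y <- paste(df.y$var1, df.y$var2, sep = "|")
hits  <- which(key.x %in% key.y)   # row numbers in df.x with a match in df.y
matched.x <- df.x[hits, ]
```

Note that merge() sorts the result by the key columns by default and can return more rows than df.x has matches if a key pair is duplicated in df.y; the %in% version keeps df.x's original order and returns each matching row once.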

answered Oct 22 '19 at 19:34

makeshift-programmer
