Mark common rows between data frames in R

Question

I'm trying to mark the rows of a larger data frame whose id also appears in a smaller data frame.

I've looked into similar topics ("find the common ids between two data frames in R", "Find indices of duplicated rows", and "Finding ALL duplicate rows, including 'elements with smaller subscripts'"), but couldn't figure out how to make it work the following way:

df1 <- data.frame(id = c(1, 2, 3, 4, NA, 5, 6, NA, NA, 7, 8))
df2 <- data.frame(id = c(NA, 8, 3, NA, 1))

# Result

> df1
   id match
1   1  TRUE
2   2 FALSE
3   3  TRUE
4   4 FALSE
5  NA FALSE
6   5 FALSE
7   6 FALSE
8  NA FALSE
9  NA FALSE
10  7 FALSE
11  8  TRUE

score 3 · Accepted Answer · answered Jun 24 '20 at 09:14

You can use %in% to check for matches. On its own, %in% treats an NA in df1$id as matching an NA in df2$id, so combine it with !is.na() to keep NA rows from being marked.

df1$match <- df1$id %in% df2$id & !is.na(df1$id)
df1

#   id match
#1   1  TRUE
#2   2 FALSE
#3   3  TRUE
#4   4 FALSE
#5  NA FALSE
#6   5 FALSE
#7   6 FALSE
#8  NA FALSE
#9  NA FALSE
#10  7 FALSE
#11  8  TRUE
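
To see why the is.na() guard matters, here is a small sketch comparing %in% on its own with a variant that drops NA from the lookup values first. The names naive, lookup, and filtered are only illustrative and are not part of the answer above; filtering the lookup vector is an alternative way to get the same result, not the method the answer uses.

```r
df1 <- data.frame(id = c(1, 2, 3, 4, NA, 5, 6, NA, NA, 7, 8))
df2 <- data.frame(id = c(NA, 8, 3, NA, 1))

# %in% alone: NA in df1$id matches NA in df2$id, so rows 5, 8 and 9 come back TRUE
naive <- df1$id %in% df2$id

# Alternative (illustrative): remove NA from the lookup values before matching
lookup <- df2$id[!is.na(df2$id)]
filtered <- df1$id %in% lookup

# Both NA-safe approaches should agree
identical(filtered, df1$id %in% df2$id & !is.na(df1$id))
```

Either form works; the version in the answer keeps everything in one expression, while filtering the lookup vector can read more clearly when the same lookup is reused several times.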

Ronak Shah

Thank you, this `& !is.na(df1$id)` is the part that I was missing in my attempts. – blazej Jun 24 '20 at 09:24