Filtering of two paired columns

Asked Jun 28 '16 at 14:08

Consider the following:

> df <- data.frame(x = letters[1:15], y = rep(1:3, 5), z = rep(1:5, 3))
> df
   x y z
1  a 1 1
2  b 2 2
3  c 3 3
4  d 1 4
5  e 2 5
6  f 3 1
7  g 1 2
8  h 2 3
9  i 3 4
10 j 1 5
11 k 2 1
12 l 3 2
13 m 1 3
14 n 2 4
15 o 3 5

I have another data frame, df2, say

> df2 <- data.frame(y = c(2, 3), z = c(2, 5))
> df2
  y z
1 2 2
2 3 5

I would like to filter out the rows of df whose (y, z) values match a row of df2. That is, the output should be df with rows 2 (b, 2, 2) and 15 (o, 3, 5) removed.

The pairing (y, z) is important. I've tried doing something like

df[!((df$y %in% df2$y) & (df$z %in% df2$z)), ]

but here's the problem: if I did this, not only would the pairs (2, 2) and (3, 5) be filtered, but (3, 2) and (2, 5) as well, which I do not want to happen.
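To make the over-filtering concrete, here is a minimal sketch that lists the rows the attempt above would drop. It reuses df and df2 as defined earlier; the name `dropped` is introduced here only for illustration.

```r
df  <- data.frame(x = letters[1:15], y = rep(1:3, 5), z = rep(1:5, 3))
df2 <- data.frame(y = c(2, 3), z = c(2, 5))

# Each column is tested on its own, so any listed y combined with
# any listed z counts as a match, not just the pairs stored in df2.
dropped <- df[(df$y %in% df2$y) & (df$z %in% df2$z), ]
dropped
# Rows b (2, 2), e (2, 5), l (3, 2) and o (3, 5) are selected,
# although only b and o correspond to actual rows of df2.
```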

Obviously, I could concatenate the columns and filter based on that, but I'm wondering if there's a better way to deal with this problem.

Clarinetist

I guess you're looking for what's called an "anti join". Here's one way with dplyr: `dplyr::anti_join(df, df2, by = c("y", "z"))` – talat Jun 28 '16 at 14:18
@docendo, you can put your comment as a solution – Colonel Beauvel Jun 28 '16 at 14:20
or `df[!paste(df$y, df$z) %in% paste(df2$y, df2$z),]` – Sotos Jun 28 '16 at 14:20
@docendodiscimus Thank you, I've never heard that term! – Clarinetist Jun 28 '16 at 14:23
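The two suggestions from the comments can be tried side by side. This is only a sketch: the names `key_df`, `key_df2`, `res_base` and `res_dplyr` are made up for the example, and the dplyr part assumes the package is installed (it is skipped otherwise).

```r
df  <- data.frame(x = letters[1:15], y = rep(1:3, 5), z = rep(1:5, 3))
df2 <- data.frame(y = c(2, 3), z = c(2, 5))

# Base R: build one key per row so that y and z are compared as a pair.
# paste() separates the values with a space by default.
key_df  <- paste(df$y, df$z)
key_df2 <- paste(df2$y, df2$z)
res_base <- df[!key_df %in% key_df2, ]

# dplyr: an anti join keeps the rows of df that have no match in df2
# on the columns named in `by`.
if (requireNamespace("dplyr", quietly = TRUE)) {
  res_dplyr <- dplyr::anti_join(df, df2, by = c("y", "z"))
}

res_base
```

Both approaches should leave 13 rows, dropping only b (2, 2) and o (3, 5). The base R subset keeps the original row names, whereas the data frame returned by the anti join is renumbered.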
