How to tell what is in one vector and not another?

Question

In MATLAB there is a way to find the values that are in one vector but not in the other.

For example:

x <- c(1,2,3,4)
y <- c(2,3,4)

Is there a function that would tell me that the value in x that is not in y is 1?

score 137 · Accepted Answer · answered Dec 03 '09 at 06:53

You can use the setdiff() (set difference) function:

> setdiff(x, y)
[1] 1

Xela

Watch out: `setdiff(x,y)` and `setdiff(y,x)` are not the same. – Xi'an ні війні Feb 21 '17 at 08:36

score 64 · Answer 2 · edited May 23 '17 at 12:34

Yes. For vectors you can simply use the %in% operator or is.element() function.

> x[!(x %in% y)]
[1] 1

For a matrix, there are many different approaches. merge() is probably the most straightforward. I suggest looking at this question for that scenario.
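
As a rough illustration of the merge() idea for tabular data, here is a minimal sketch. The data frames `a` and `b`, the key column `id`, and the helper column `in_b` are made up for this example and are not from the linked question; other approaches may suit a real matrix better.

```r
# Hypothetical data frames for illustration only.
a <- data.frame(id = c(1, 2, 3, 4))
b <- data.frame(id = c(2, 3, 4))

# Flag every row of b, then left-join a against it on the key column.
b$in_b <- TRUE
merged <- merge(a, b, by = "id", all.x = TRUE)

# Rows of a with no partner in b carry NA in the flag column.
only_in_a <- merged[is.na(merged$in_b), "id", drop = FALSE]
print(only_in_a)
```

Here `only_in_a` should contain the single row with id 1, mirroring what setdiff() gives for plain vectors.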

answered Dec 03 '09 at 06:16

Shane

Jeromy Anglim · Answer 3 · 2017-10-05T22:56:28.877

32

The help file in R for setdiff, union, intersect, setequal, and is.element provides information on the standard set functions in R.

setdiff(x, y) returns the elements of x that are not in y.

As noted above, it is an asymmetric difference. So for example:

> x <- c(1,2,3,4)
> y <- c(2,3,4,5)
> 
> setdiff(x, y)
[1] 1
> setdiff(y, x)
[1] 5
> union(setdiff(x, y), setdiff(y, x))
[1] 1 5

answered Dec 04 '09 at 03:43

Jeromy Anglim

score 12 · Answer 4 · answered Dec 04 '09 at 14:54

x[is.na(match(x,y))]
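
To unpack why this one-liner works, here is a small sketch using the vectors from the question. match() returns, for each element of x, the position of its first match in y, or NA when there is none; as documented in ?match, `%in%` is built on the same function.

```r
x <- c(1, 2, 3, 4)
y <- c(2, 3, 4)

# For each element of x: its first position in y, or NA if absent.
pos <- match(x, y)
print(pos)

# Keep only the elements of x that found no match in y.
not_in_y <- x[is.na(pos)]
print(not_in_y)
```

With these inputs `pos` is NA 1 2 3, so only the first element of x (the value 1) is kept.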

gd047

Workhorse · Answer 5 · 2018-07-05T14:42:11.337

7

setdiff() can be tricky because its output depends on the order of its arguments. You can instead write a simple function that returns the symmetric difference, i.e. the elements found in exactly one of the two vectors, which is the opposite of intersect(). This is often more convenient.

> difference <- function(x, y) {
+   c(setdiff(x, y), setdiff(y, x))
+ }
> 
> # Now let's test it.
> x <- c(1,2,3,4)
> y <- c(2,3,4,5)
> 
> difference(x, y)
[1] 1 5

answered Jul 03 '18 at 18:59

Workhorse

score 5 · Answer 6 · answered Oct 17 '19 at 23:06

If:

x <- c(1,2,3,4)
y <- c(2,3,4)

Any of these expressions:

setdiff(x, y)
x[!(x %in% y)]
x[is.na(match(x,y))]
x[!(is.element(x,y))]

will give you the right answer, [1] 1, if the goal is to find the values/characters in x that are not present in y.

However, applying the above expressions can be tricky and can give undesirable results depending on the nature of the vector, and the position of x and y in the expression. For instance, if:

x <- c(1,1,2,2,3,4)
y <- c(2,3,4)

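and the goal is just to find the unique values/characters in x that are not present in y, or vice versa, then the indexing expressions above return the duplicated value twice ([1] 1 1). Either of the following expressions still gives the right answer, [1] 1, regardless of the order of x and y: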

union(setdiff(x, y), setdiff(y, x))

Thanks to Jeromy Anglim's contribution.

OR:

difference <- function(x, y) {
c(setdiff(x, y), setdiff(y, x))
}
difference(y,x)

Thanks to Workhorse's contribution.
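
To see the effect of duplicates concretely, here is a short sketch comparing the approaches on the second pair of vectors. The variable names `by_index`, `by_set` and `dedup` are only for illustration.

```r
x <- c(1, 1, 2, 2, 3, 4)
y <- c(2, 3, 4)

# Logical indexing keeps every occurrence of a value missing from y.
by_index <- x[!(x %in% y)]

# setdiff() treats its inputs as sets, so duplicates are dropped.
by_set <- setdiff(x, y)

# Wrapping the indexing form in unique() gives the same result as setdiff().
dedup <- unique(x[!(x %in% y)])

print(by_index)
print(by_set)
print(dedup)
```

So if repeated values matter, the indexing forms are the ones to use; if only distinct values matter, setdiff() or unique() does the job.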
