99

In MATLAB there is a way to find the values that are in one vector but not in the other.

For example:

x <- c(1,2,3,4)
y <- c(2,3,4)

Is there a function in R that would tell me that the value in x that is not in y is 1?

Jaap
Tony Stark

6 Answers

137

You can use the setdiff() (set difference) function:

> setdiff(x, y)
[1] 1
Xela
64

Yes. For vectors you can simply use the %in% operator or the is.element() function.

> x[!(x %in% y)]
[1] 1

For a matrix, there are many different approaches. merge() is probably the most straightforward. I suggest looking at this question for that scenario.

Shane
32

The R help file for setdiff, union, intersect, setequal, and is.element (see ?setdiff) documents the standard set functions.

setdiff(x, y) returns the elements of x that are not in y.

As noted above, it is an asymmetric difference. So for example:

> x <- c(1,2,3,4)
> y <- c(2,3,4,5)
> 
> setdiff(x, y)
[1] 1
> setdiff(y, x)
[1] 5
> union(setdiff(x, y), setdiff(y, x))
[1] 1 5
Jeromy Anglim
12
x[is.na(match(x,y))]
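
To see why this one-liner works, here is a minimal sketch using the x and y from the question. The variable names pos and not_in_y are only illustrative and are not part of the original answer.

```r
x <- c(1, 2, 3, 4)
y <- c(2, 3, 4)

# match() gives, for each element of x, the position of its first match in y,
# or NA when there is no match.
pos <- match(x, y)
pos
# [1] NA  1  2  3

# Keeping only the elements whose position is NA leaves the values of x
# that do not occur in y.
not_in_y <- x[is.na(pos)]
not_in_y
# [1] 1
```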
gd047
7

setdiff() can be surprising because its result depends on the order of its arguments. You can instead write a simple function that returns the symmetric difference, i.e. the elements that are in exactly one of the two vectors (everything in the union that intersect() would leave out). This is more convenient when the argument order should not matter.

> difference <- function(x, y) {
+   c(setdiff(x, y), setdiff(y, x))
+ }

> # Now let's test it.
> x <- c(1,2,3,4)
> y <- c(2,3,4,5)

> difference(x, y)
[1] 1 5
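
As a quick check of the order-independence claim, the sketch below calls difference() with the arguments swapped. Note that the elements come back in a different order, so the two results match as sets (setequal()) rather than element by element. The names d_xy, d_yx and same_set are illustrative only.

```r
difference <- function(x, y) {
  c(setdiff(x, y), setdiff(y, x))
}

x <- c(1, 2, 3, 4)
y <- c(2, 3, 4, 5)

d_xy <- difference(x, y)  # elements only in x, then elements only in y: 1 5
d_yx <- difference(y, x)  # elements only in y, then elements only in x: 5 1

# Same members either way, although the ordering differs.
same_set <- setequal(d_xy, d_yx)
same_set
# [1] TRUE
```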
Workhorse
5

If:

x <- c(1,2,3,4)
y <- c(2,3,4)

Any of these expressions:

setdiff(x, y)
x[!(x %in% y)]
x[is.na(match(x,y))]
x[!(is.element(x,y))]

will give you the right answer, [1] 1, if the goal is to find the values that are in x but not in y.

However, these expressions can give unexpected results depending on whether the vectors contain duplicates and on the order of x and y in the expression. For instance, if:

x <- c(1,1,2,2,3,4)
y <- c(2,3,4)

and the goal is to find the unique values that are present in x but not in y, or vice versa, then either of the following expressions still gives the right answer, [1] 1:

union(setdiff(x, y), setdiff(y, x))

Thanks to the contribution of Jeromy Anglim.

OR:

difference <- function(x, y) {
c(setdiff(x, y), setdiff(y, x))
}
difference(y,x)

Thanks to the contribution of Workhorse.
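
To make the effect of duplicates concrete, here is a small sketch comparing the expressions on the vector with repeated values. The variable names (by_setdiff, by_in, by_match, by_unique) are illustrative only. setdiff() drops duplicates from its result, while the indexing forms keep every matching element unless wrapped in unique().

```r
x <- c(1, 1, 2, 2, 3, 4)
y <- c(2, 3, 4)

by_setdiff <- setdiff(x, y)            # duplicates are dropped
by_in      <- x[!(x %in% y)]           # duplicates are kept
by_match   <- x[is.na(match(x, y))]    # duplicates are kept
by_unique  <- unique(x[!(x %in% y)])   # same result as setdiff(x, y) here

by_setdiff
# [1] 1
by_in
# [1] 1 1
```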

William