In MATLAB there is a way to find the values that are in one vector but not in another.
For example:
x <- c(1,2,3,4)
y <- c(2,3,4)
Is there any function in R that would tell me that the value in x that is not in y is 1?
You can use the setdiff() (set difference) function:
> setdiff(x, y)
[1] 1
Yes. For vectors you can simply use the %in% operator or the is.element() function.
> x[!(x %in% y)]
[1] 1
For a matrix, there are many different approaches. merge() is probably the most straightforward. I suggest looking at this question for that scenario.
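As a rough sketch of the merge() idea, assuming the data are held in two data frames that share the same columns (the names x_df, y_df, in_y and only_in_x are made up for illustration), one way to find the rows of one table that are missing from the other is to flag the second table and left-join on all common columns:

```r
# Hypothetical example data: two data frames with identical column names.
x_df <- data.frame(a = c(1, 2, 3), b = c("p", "q", "r"))
y_df <- data.frame(a = c(2, 3),    b = c("q", "r"))

# Mark every row of y_df so it can be recognised after the merge.
y_df$in_y <- TRUE

# Left join on the shared columns (a and b); rows of x_df with no
# match in y_df get NA in the in_y column.
merged <- merge(x_df, y_df, all.x = TRUE)

# Keep only the unmatched rows and drop the helper column.
only_in_x <- merged[is.na(merged$in_y), names(x_df)]
only_in_x
```

Note that merge() sorts its result by the join columns, so the row order of the input is not preserved.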
The R help file for setdiff, union, intersect, setequal, and is.element provides information on the standard set functions in R.
setdiff(x, y) returns the elements of x that are not in y.
As noted above, it is an asymmetric difference. So for example:
> x <- c(1,2,3,4)
> y <- c(2,3,4,5)
>
> setdiff(x, y)
[1] 1
> setdiff(y, x)
[1] 5
> union(setdiff(x, y), setdiff(y, x))
[1] 1 5
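setdiff() is a tricky function because its output depends on the order of its arguments. You can instead write a simple function that returns the symmetric difference, i.e. the elements that appear in exactly one of the two vectors (the complement of what intersect() returns). This is handy when the argument order should not matter.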
> difference <- function(x, y) {
+   c(setdiff(x, y), setdiff(y, x))
+ }
> # Now let's test it.
> x <- c(1,2,3,4)
> y <- c(2,3,4,5)
> difference(x, y)
[1] 1 5
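A small sketch of how such a helper behaves when the arguments are swapped (the function is redefined here so the snippet runs on its own): the same elements come back, but their order follows the argument order, so setequal() is the safer comparison.

```r
# Symmetric difference built from two asymmetric set differences.
difference <- function(x, y) {
  c(setdiff(x, y), setdiff(y, x))
}

x <- c(1, 2, 3, 4)
y <- c(2, 3, 4, 5)

d_xy <- difference(x, y)  # 1 5
d_yx <- difference(y, x)  # 5 1

# Same set of values, different element order.
same_set <- setequal(d_xy, d_yx)  # TRUE
```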
If:
x <- c(1,2,3,4)
y <- c(2,3,4)
any of these expressions:
setdiff(x, y)
x[!(x %in% y)]
x[is.na(match(x,y))]
x[!(is.element(x,y))]
will give you the right answer, [1] 1, if the goal is to find the values/characters in x that are not present in y.
However, applying the above expressions can be tricky and can give undesirable results depending on the nature of the vectors and on the position of x and y in the expression. For instance, if:
x <- c(1,1,2,2,3,4)
y <- c(2,3,4)
and the goal is just to find the unique values/characters in x that are not present in y, or vice versa, then the indexing expressions above keep the repeated value and return [1] 1 1. Either of the following expressions will still give the right answer, [1] 1:
union(setdiff(x, y), setdiff(y, x))
Thanks to the contribution of Jeromy Anglim.
Or:
difference <- function(x, y) {
c(setdiff(x, y), setdiff(y, x))
}
difference(y,x)
Thanks to the contribution of Workhouse.
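To make the point about repeated values concrete, here is a short sketch using the same x and y (the result variable names are only for illustration). setdiff() drops duplicates, while the indexing forms keep them unless wrapped in unique():

```r
x <- c(1, 1, 2, 2, 3, 4)
y <- c(2, 3, 4)

via_setdiff <- setdiff(x, y)             # 1    (duplicates removed)
via_in      <- x[!(x %in% y)]            # 1 1  (duplicates kept)
via_match   <- x[is.na(match(x, y))]     # 1 1  (duplicates kept)
via_unique  <- unique(x[!(x %in% y)])    # 1    (duplicates removed explicitly)
```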