1

I want to find the elements of one vector that are not present in another vector. For instance, consider the following vectors:

veca <- c("ab", "cd", "ef", "gh", "ij", "kl")
vecb <- c("ab", "ef", "ij", "kl")

From this, cd and gh are in veca but not in vecb. How can one identify these elements in R? Thanks!

iGada
  • `not_in_vecb <- veca[!veca %in% vecb]` – TarJae Apr 17 '23 at 20:14
  • Does this answer your question? [Compare two character vectors in R](https://stackoverflow.com/questions/17598134/compare-two-character-vectors-in-r) – S-SHAAF Apr 17 '23 at 21:10

4 Answers

3

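We could define a custom function, a sort of opposite of intersect(), using setdiff(). It returns the elements found in exactly one of the two vectors, sorted and without duplicates, so it gives the requested result here only because every element of vecb also appears in veca.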

learned here:

outersect <- function(x, y) {
  sort(c(setdiff(x, y),
         setdiff(y, x)))
}

outersect(veca, vecb)

output:

[1] "cd" "gh"

Another possible solution, which keeps only the elements of veca that are missing from vecb, is:

not_in_vecb <- veca[!veca %in% vecb]
not_in_vecb

[1] "cd" "gh"
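As a rough sketch of how the two approaches differ, the following compares them on a second vector that has an element of its own. The vector vecc and its "zz" entry are made up for illustration and are not part of the question.

```r
veca <- c("ab", "cd", "ef", "gh", "ij", "kl")
vecc <- c("ab", "ef", "ij", "kl", "zz")  # hypothetical: "zz" is not in veca

# Same helper as above: elements found in exactly one of the two vectors
outersect <- function(x, y) {
  sort(c(setdiff(x, y),
         setdiff(y, x)))
}

both_sides <- outersect(veca, vecc)   # differences in both directions
only_in_a  <- veca[!veca %in% vecc]   # only the elements of veca missing from vecc

print(both_sides)
print(only_in_a)
```

Under these assumptions, both_sides also picks up "zz", while only_in_a contains just "cd" and "gh", so the %in% form is probably the closer match to what the question asks for.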
TarJae
2

You can use the %in% operator:

# Identify the elements in veca that are not in vecb

> veca[!(veca %in% vecb)]
[1] "cd" "gh"
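One detail that may matter in practice: subsetting with %in% keeps repeated elements and their original order, whereas setdiff() drops duplicates. A small sketch, assuming a made-up input veca_dup with a repeated "cd":

```r
veca_dup <- c("cd", "ab", "cd", "gh")  # hypothetical input with a repeated "cd"
vecb <- c("ab", "ef", "ij", "kl")

# Logical subsetting keeps every non-matching element, repeats included
kept_with_in <- veca_dup[!(veca_dup %in% vecb)]

# setdiff() returns each non-matching element only once
kept_with_setdiff <- setdiff(veca_dup, vecb)

print(kept_with_in)
print(kept_with_setdiff)
```

Which behaviour is preferable depends on whether the repeats carry meaning in your data.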
2

Here we consider a slightly more general case, where each vector has elements missing from the other:

veca <- c("ab", "cd", "ef", "gh", "ij", "kl") # "cd" and "gh" are not in `vecb`
vecb <- c("ab", "ef", "ij", "kl", "xy") # "xy" is not in `veca`

Below are some options we can try:

  1. set operations: union, setdiff, and intersect
> setdiff(union(veca, vecb), intersect(veca, vecb))
[1] "cd" "gh" "xy"
  2. stack + aggregate + subset to keep the values that occur in only one of the vectors
subset(
  aggregate(
    ind ~ .,
    stack(list(a = veca, b = vecb)),
    unique
  ),
  lengths(ind) == 1,
  select = values
)

which gives

  values
2     cd
4     gh
7     xy
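Both options above return the unmatched values without saying which vector each one came from. If that matters, one possible variation (a sketch, not part of the original answer) is to call setdiff() once per direction and keep the results separate. The list name unmatched and its element names are arbitrary choices.

```r
veca <- c("ab", "cd", "ef", "gh", "ij", "kl")
vecb <- c("ab", "ef", "ij", "kl", "xy")

# setdiff() is one-directional, so call it once per direction
unmatched <- list(
  only_in_veca = setdiff(veca, vecb),
  only_in_vecb = setdiff(vecb, veca)
)

print(unmatched)
```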
ThomasIsCoding
1

You can use the built-in function setdiff(), which returns the elements of its first argument that are not in its second:

Example:

veca <- c("ab", "cd", "ef", "gh", "ij", "kl")
vecb <- c("ab", "ef", "ij", "kl")


print(setdiff(veca, vecb))

Output:

[1] "cd" "gh"
stressed