I am confused about the which function. Basically I thought that it checks at which position of an input object (e.g., a vector) a logical condition is true. As seen in the documentation:

which(LETTERS == "R")
[1] 18

In other words, it goes through all LETTERS values and checks if value == R. But this seems to be a misunderstanding. If I input

a <- c("test","test2","test3","test4")
b <- c("test","test3")
which(a==b)
[1] 1

it returns [1] 1 although test3 does also appear in both vectors. Also, if I input a shorter vector for a, it returns a warning:

a <- c("test","test2","test3")
b <- c("test","test3")
which(a==b)
[1] 1

Warning message:
In a == b : longer object length is not a multiple of shorter object length

My question here is twofold:

  1. How can I return the positions of a character vector a that match a character vector b?

  2. How does which() actually operate? I obviously misunderstand the function.

Thank you for your answers

Edit: thank you for your quick replies, you clarified my misunderstanding!

Ian Kemp
00schneider

2 Answers

== compares values element by element (a[1]==b[1], a[2]==b[2], ...), not as sets.

To test set membership, use %in%.

Use a[which(a %in% b)] to get [1] "test" "test3"

which() returns the indices of the TRUE elements, not the values themselves.

which(a %in% b) will return

[1] 1 3

The reason for the strange warning message is R's recycling:

Warning message:
In a == b : longer object length is not a multiple of shorter object length

When you compare a vector of length 4 with a vector of length 2 element by element (using ==), R 'recycles' the shorter vector. With lengths 4 and 2 this works cleanly, and you get the result of (a[1]==b[1], a[2]==b[2], a[3]==b[1], a[4]==b[2]). With lengths 3 and 2, as in your second example, R still recycles the shorter vector, but it warns you that the longer length is not a whole multiple of the shorter one.
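To make the recycling visible, here is a small sketch using the vectors from the question. The names b_recycled, a3 and res3 are only illustrative, and rep(..., length.out = ) is used here as a stand-in to show what == effectively compares against; it is not meant as a description of R's internals.

```r
a <- c("test", "test2", "test3", "test4")
b <- c("test", "test3")

# Spell out the recycled version of b that == effectively compares against
b_recycled <- rep(b, length.out = length(a))
print(b_recycled)            # "test" "test3" "test" "test3"

# Element-wise comparison: only position 1 matches
print(a == b_recycled)       # TRUE FALSE FALSE FALSE
print(identical(a == b, a == b_recycled))

# Lengths 3 and 2: the comparison still runs, but R warns about the lengths;
# the warning is suppressed here only to keep the output clean
a3 <- a[1:3]
res3 <- suppressWarnings(a3 == b)
print(res3)                  # TRUE FALSE FALSE
```

Because "test3" sits at position 3 of a but is compared against the recycled "test", it never lines up with its counterpart in b, which is why which(a == b) reports only position 1.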

Axeman
Zahiro Mor
You need to give which an input that tells it what elements of a are in b:

which(a%in%b)
[1] 1 3

which essentially identifies which elements are TRUE in a logical vector.
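As a rough illustration, the steps can be separated out using the vectors from the question. The names in_b and idx are just for this sketch, and the match() call at the end is an addition that is not part of the answer above: it is another base R function that returns positions directly.

```r
a <- c("test", "test2", "test3", "test4")
b <- c("test", "test3")

# Step 1: a logical vector, one entry per element of a
in_b <- a %in% b
print(in_b)        # TRUE FALSE TRUE FALSE

# Step 2: which() turns the TRUE entries into positions
idx <- which(in_b)
print(idx)         # 1 3

# Step 3: index a by position, or directly by the logical vector
print(a[idx])      # "test" "test3"
print(a[in_b])     # same values

# match() gives, for each element of b, its first position in a
print(match(b, a)) # 1 3
```

Note that which(a %in% b) reports positions in the order they occur in a, while match(b, a) reports them in the order of b and returns NA for elements of b that are absent from a.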

James