424

In R, I have an element x and a vector v. I want to find the first index of an element in v that is equal to x. I know that one way to do this is: which(x == v)[[1]], but that seems excessively inefficient. Is there a more direct way to do it?

For bonus points, is there a function that works if x is a vector? That is, it should return a vector of indices indicating the position of each element of x in v.

Joris Meys
Ryan C. Thompson
  • 2
    As R is optimized to work with vectors, `which(x == v)[[1]]` is not so very inefficient. It's one comparison (`==`) operator applied to all vector elements and one subsetting on the indices (`which`). That's it. Nothing that should be relevant, as long as you're not running 10.000 repetitions on this function. Other solutions like `match` and `Position` may not return as many data as `which`, but they're not necessarily more efficient. – BurninLeo Oct 11 '15 at 18:09
  • 6
    My question specified that I would prefer a function that was vectorized over x, and `which(x == v)[[1]]` is not. – Ryan C. Thompson Oct 11 '15 at 22:12
  • new question placeholder: – berriz44 Aug 22 '23 at 12:41

4 Answers

585

The function match works on vectors:

x <- sample(1:10)
x
# [1]  4  5  9  3  8  1  6 10  7  2
match(c(4,8),x)
# [1] 1 5

match returns only the first match for each value, as you requested. For each value in the first argument, it returns that value's position in the second argument, so the order of the arguments matters.
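As a minimal sketch of how argument order and missing values behave, the snippet below uses a fixed copy of the vector printed above (stored in `v`, a name chosen here for illustration) and adds a value, 11, that does not occur in it. `nomatch` is a documented argument of `match`.

```r
v <- c(4, 5, 9, 3, 8, 1, 6, 10, 7, 2)  # fixed copy of the sampled vector above
x <- c(4, 8, 11)                       # 11 does not occur in v

# Position in v of each element of x; absent values give NA
first_pos <- match(x, v)
first_pos
# [1]  1  5 NA

# Use 0 instead of NA for absent values
first_pos0 <- match(x, v, nomatch = 0L)
first_pos0
# [1] 1 5 0

# Swapped arguments: position in x of each element of v
swapped <- match(v, x)
swapped
# [1]  1 NA NA NA  2 NA NA NA NA NA
```

The result always has the same length as the first argument, which is why swapping the arguments changes the shape of the answer as well as its values.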

For multiple matching, %in% is the way to go:

x <- sample(1:4,10,replace=TRUE)
x
# [1] 3 4 3 3 2 3 1 1 2 2
which(x %in% c(2,4))
# [1]  2  5  9 10

%in% returns a logical vector as long as the first argument, with a TRUE if that value can be found in the second argument and a FALSE otherwise.
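To make the difference between the two approaches concrete, here is a small sketch on a fixed copy of the vector printed above. It only uses `%in%`, `which`, `==` and `match` from base R; the variable names are chosen for illustration.

```r
x <- c(3, 4, 3, 3, 2, 3, 1, 1, 2, 2)  # fixed copy of the sampled vector above

# Logical vector: is each element of x one of 2 or 4?
is_hit <- x %in% c(2, 4)
is_hit
# [1] FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE FALSE  TRUE  TRUE

# All positions holding a 2 or a 4 (without saying which value was found)
all_pos <- which(is_hit)
all_pos
# [1]  2  5  9 10

# First position of each value separately, in the order they were asked for
first_pos <- match(c(2, 4), x)
first_pos
# [1] 5 2

# All positions of one particular value
pos_of_2 <- which(x == 2)
pos_of_2
# [1]  5  9 10
```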

TylerH
Joris Meys
  • 1
    I think that an example with c(2,3,3) and c(1,2,3,4) with both match and %in% would be more instructive with fewer changes between the examples. match(c(2,3,3), c(1:4)) returns different results from which(c(2,3,3) %in% c(1:4)) without needing a longer first vector and as many changes from example to example. It's also worth noting that they handle non-matches very differently. – John Apr 07 '11 at 13:30
  • 1
    @John : that's all true, but that is not what the OP asked. The OP asked, starting from a long vector, to find the first match of elements given in another one. And for completeness, I added that if you are interested in all indices, you'll have to use which(%in%). BTW, there is no reason to delete your answer. It's valid information. – Joris Meys Apr 07 '11 at 13:36
  • 3
    I think it would be helpful to stress that the order of the arguments in `match` matters if you want the index of the first occurrence. For your example, `match(x,c(4,8))` gives different results, which is not super obvious at first. – apitsch Jun 11 '17 at 09:20
  • @goldenoslik It helps if you read the help page of `match`. It's all explained there. But I added that piece of information. – Joris Meys Jun 11 '17 at 10:14
33

The function Position in funprog {base} also does the job. It lets you pass an arbitrary predicate function and returns the position of the first or last element for which that function is true.

Position(f, x, right = FALSE, nomatch = NA_integer_)
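A short usage sketch, assuming an example vector `v` and anonymous predicate functions made up for illustration:

```r
v <- c(3, 2, -7, -3, 5, 2)  # example data, not from the question

# First position where the element equals 2
first_two <- Position(function(e) e == 2, v)
first_two
# [1] 2

# Last position where the element equals 2 (search from the right)
last_two <- Position(function(e) e == 2, v, right = TRUE)
last_two
# [1] 6

# The predicate can be any test, not just equality
first_negative <- Position(function(e) e < 0, v)
first_negative
# [1] 3

# No element satisfies the predicate: the nomatch value (NA by default) is returned
no_hit <- Position(function(e) e > 100, v)
no_hit
# [1] NA
```

Note that Position looks for one predicate at a time, so unlike match it is not vectorized over several search values.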

pedroteixeira
19

A small note about the efficiency of the above-mentioned methods:

library(microbenchmark)

microbenchmark(
  which("Feb" == month.abb)[[1]],
  which(month.abb %in% "Feb"))

Unit: nanoseconds
  min     lq    mean median     uq  max neval
  891  979.0 1098.00   1031 1135.5 3693   100
 1052 1175.5 1339.74   1235 1390.0 7399   100

So, on this 12-element vector, the faster of the two is

    which("Feb" == month.abb)[[1]]
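Timings on a 12-element vector say little about long vectors. The following is a rough sketch of how the comparison could be repeated on a larger input using only base R's `system.time`; the vector length, the seed and the repetition count are arbitrary assumptions, and no timings are shown because they depend on the machine.

```r
set.seed(42)                  # arbitrary seed, only for reproducibility
v <- sample(1e6)              # assumed size: a permutation of 1..1,000,000
x <- v[length(v) %/% 2]       # a value known to sit in the middle of v

# Both approaches should agree on the position
first_which <- which(x == v)[[1]]
first_match <- match(x, v)

# Time 20 repetitions of each approach (results vary by machine)
time_which <- system.time(for (i in 1:20) which(x == v)[[1]])
time_match <- system.time(for (i in 1:20) match(x, v))
print(time_which)
print(time_match)
```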
augenbrot
Andrii
  • 2
    Your benchmark is based on a length 12 vector and hence not meaningful. Also in your example `which("Feb" == month.abb)` returns `2`–why the `[[1]]` ? – markus Nov 20 '19 at 20:17
  • 1
    @markus this code which("Feb" == month.abb)[[1]] return "2", and this code which(month.abb %in% "Feb") also returns "2". Also, not clear why using vector is not meaningful – Andrii Nov 21 '19 at 16:48
  • 4
    It is not about the vector, but about its length. You should generate a vector of appropriate length and then do a benchmark based on that. Quoting from OPs question _"I know that one way to do this is:_ `which(x == v)[[1]]`, _but that seems excessively inefficient."_ – markus Nov 21 '19 at 22:05
15

Yes, we can find the index of an element in a vector as follows:

> a <- c(3, 2, -7, -3, 5, 2)
> b <- (a==-7)  # this will output a TRUE/FALSE vector
> c <- which(a==-7) # this will give you numerical value
> a
[1]  3  2 -7 -3  5  2
> b
[1] FALSE FALSE  TRUE FALSE FALSE FALSE
> c
[1] 3

This is one of the most efficient methods of finding the index of an element in a vector.
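Because `which` returns every matching index, getting only the first one takes an extra subsetting step. The sketch below, on the same vector `a`, shows that step and what happens when nothing matches; the value 99 is simply a number chosen because it is absent from `a`.

```r
a <- c(3, 2, -7, -3, 5, 2)

# The value 2 occurs twice, so which() returns two indices
all_twos <- which(a == 2)
all_twos
# [1] 2 6

# Keep only the first index
first_two <- which(a == 2)[1]
first_two
# [1] 2

# No match: which() returns an empty integer vector
none <- which(a == 99)
length(none)
# [1] 0

# Single-bracket subsetting of the empty result gives NA ...
first_none <- which(a == 99)[1]
first_none
# [1] NA

# ... whereas which(a == 99)[[1]] would stop with a "subscript out of bounds" error
```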

Martin Gal
CinnamonCubing