Finding which element of a vector is between two values in R

Question

I have two vectors x and y. I would like to find which elements of x are between the two elements of vector y. How can I do it in R?

x = c( .2, .4, 2.1, 5.3, 6.7, 10.5)
y = c( 1, 7)

I have written the following code, but it does not give me the correct result.

> x = x[ x >= y[1] && x <= y[2]]
> x
numeric(0)

The result should look like this:

res = c(2.1, 5.3, 6.7)

Future readers may also be interested in `findInterval`, which isn't quite what is needed here, but is another tool to find which two values a number is between. — Aaron left Stack Overflow, Dec 24 '13 at 01:05
Also see [this related question](http://stackoverflow.com/q/12946070/210673) and more info on `&` and `&&` in [this question](http://stackoverflow.com/a/6559049/210673). — Aaron left Stack Overflow, Dec 24 '13 at 01:07

score 11 · Accepted Answer · edited Dec 24 '13 at 13:55

You are looking for &, not &&:

x = c( .2, .4, 2.1, 5.3, 6.7, 10.5)
y = c( 1, 7)
x = x[ x >= y[1] & x <= y[2]]
x
# [1] 2.1 5.3 6.7

Edited to explain. Here's the text from ?'&':

& and && indicate logical AND and | and || indicate logical OR. 
The shorter form performs elementwise comparisons in much the same way as arithmetic operators. 
The longer form evaluates left to right examining only the first element of each vector. 
Evaluation proceeds only until the result is determined.

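So when you used &&, it looked only at the first element of x (0.2, which is not >= 1), returned a single FALSE and stopped. Indexing with that single FALSE, x[FALSE], selects nothing, hence numeric(0). Note that since R 4.3.0, calling && with an operand of length greater than one is an error rather than a silent first-element comparison.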
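
To see the difference concretely, here is a minimal sketch that builds the two comparisons separately and shows what each operator works with. It deliberately avoids calling && on the full vectors, since that is an error in R 4.3.0 and later; instead it takes the first elements by hand, which is what the older behaviour quoted above amounted to. The names lower_ok, upper_ok, mask and first_only are just illustrative.

```r
x <- c(.2, .4, 2.1, 5.3, 6.7, 10.5)
y <- c(1, 7)

lower_ok <- x >= y[1]   # FALSE FALSE  TRUE  TRUE  TRUE  TRUE
upper_ok <- x <= y[2]   #  TRUE  TRUE  TRUE  TRUE  TRUE FALSE

# Elementwise AND: one TRUE/FALSE per element of x
mask <- lower_ok & upper_ok
x[mask]
# [1] 2.1 5.3 6.7

# What the old && effectively computed: first elements only
first_only <- lower_ok[1] && upper_ok[1]   # FALSE
x[first_only]
# numeric(0)
```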

score 5 · Answer 2 · edited Jun 20 '20 at 09:12

There are two convenience functions for this, both named between, in the dplyr and data.table packages:

between {dplyr}

This is a shortcut for x >= left & x <= right, implemented efficiently in C++ for local values, and translated to the appropriate SQL for remote tables.

between {data.table}

between is equivalent to x >= lower & x <= upper when incbounds=TRUE, or x > lower & x < upper when FALSE

To return the desired values with either function:

x[between(x, min(y), max(y))]
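
If neither package is loaded, the same check is easy to sketch in base R. The helper below, between_base, is a hypothetical name used only for illustration and is not part of dplyr or data.table; its incbounds argument merely mimics the data.table behaviour quoted above. The values 1 and 7 are added to the example vector to make the effect of the bounds visible.

```r
# Hypothetical helper (not from any package): TRUE where lower <= x <= upper,
# or lower < x < upper when incbounds = FALSE
between_base <- function(x, lower, upper, incbounds = TRUE) {
  if (incbounds) x >= lower & x <= upper else x > lower & x < upper
}

x <- c(.2, .4, 1, 2.1, 5.3, 6.7, 7, 10.5)   # 1 and 7 added to show the bounds
y <- c(1, 7)

x[between_base(x, min(y), max(y))]
# [1] 1.0 2.1 5.3 6.7 7.0
x[between_base(x, min(y), max(y), incbounds = FALSE)]
# [1] 2.1 5.3 6.7
```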

Another option uses findInterval:

x[findInterval(x,y)==1L]
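
One thing to keep in mind with this approach: by default findInterval treats each interval as closed on the left and open on the right, so a value exactly equal to y[2] is not selected. The sketch below adds 1 and 7 to the example vector (an addition for illustration, not part of the question's data) to show this, and uses the rightmost.closed argument to include the upper bound.

```r
x <- c(.2, .4, 1, 2.1, 5.3, 6.7, 7, 10.5)   # 1 and 7 added to show the bounds
y <- c(1, 7)

# Interval index for each element: 0 = below y[1], 1 = [y[1], y[2]), 2 = >= y[2]
findInterval(x, y)
# [1] 0 0 1 1 1 1 2 2

# Default: 7 is excluded
x[findInterval(x, y) == 1L]
# [1] 1.0 2.1 5.3 6.7

# rightmost.closed = TRUE: 7 is included
x[findInterval(x, y, rightmost.closed = TRUE) == 1L]
# [1] 1.0 2.1 5.3 6.7 7.0
```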

There appears to be a slight (microseconds) speed advantage for findInterval when using the author's original vector:

Unit: microseconds

               expr    min     lq     mean  median      uq     max neval
dplyr::between      14.078 14.839 20.37472 18.6435 20.5455  60.876   100
data.table::between 58.593 61.637 73.26434 68.2950 78.3780 160.560   100
findInterval         3.805  4.566  6.52944  5.7070  6.6585  35.385   100

Updated with a large vector:

x <- runif(1e8, 0, 10)
y <- c(1, 7)

The results show a slight advantage for data.table with a large vector, but in practice they are close enough that I'd use whichever package you already have loaded:

Unit: seconds

              expr         min       lq     mean   median       uq      max neval
dplyr::between        1.879269 1.926350 1.969953 1.947727 1.995571 2.509277   100
data.table::between   1.064609 1.118584 1.166563 1.146663 1.202884 1.800333   100
findInterval          2.207620 2.273050 2.337737 2.334711 2.393277 2.763117   100
x>=min(y) & x<=max(y) 2.350481 2.429235 2.496715 2.486349 2.542527 2.921387   100
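
The code behind these timings is not shown in the answer. As a rough way to try something similar without installing anything, the sketch below times the two base R approaches once each with system.time. It is only an approximation under stated assumptions: the vector is 1e6 long rather than 1e8, each expression runs once rather than 100 times, the seed is arbitrary, and findInterval is called with rightmost.closed = TRUE (unlike the call above) so that both approaches use the same inclusive bounds.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
x_big <- runif(1e6, 0, 10)        # smaller than the 1e8 used above
y <- c(1, 7)

# Time each approach once; the assignment inside system.time() is kept
t_cmp <- system.time(
  r_cmp <- x_big[x_big >= min(y) & x_big <= max(y)]
)
t_fi <- system.time(
  r_fi <- x_big[findInterval(x_big, y, rightmost.closed = TRUE) == 1L]
)

# Both approaches should select exactly the same elements
identical(r_cmp, r_fi)

# Elapsed wall-clock seconds for each approach (machine dependent)
c(comparison = t_cmp[["elapsed"]], findInterval = t_fi[["elapsed"]])
```

Single runs like this are noisy, so treat any difference between the two numbers as indicative at best.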

@Frank the benchmark has been updated with a 1e8 length vector. The results for a large vector favor `data.table` but will likely be consistent with what package you're already using with your data. — manotheshark, Dec 17 '16 at 19:39

score 0 · Answer 3 · G. Cocca · edited Dec 17 '16 at 22:48

If y has more than two elements, range(y) could come in handy:

x[x>=range(y)[1] & x<=range(y)[2]]
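
As a small illustration of why this helps, range returns the minimum and maximum of its argument, so the order and number of elements in y no longer matter. The y values below are made up for the example.

```r
x <- c(.2, .4, 2.1, 5.3, 6.7, 10.5)
y <- c(7, 3, 1, 4)   # unsorted, more than two elements (illustrative values)

r <- range(y)        # c(1, 7): smallest and largest value of y
x[x >= r[1] & x <= r[2]]
# [1] 2.1 5.3 6.7
```

Storing range(y) in r also avoids computing it twice.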

answered Dec 17 '16 at 22:38
