Both the dplyr and data.table packages include a convenience function called between.
between {dplyr}
This is a shortcut for x >= left & x <= right, implemented efficiently in C++ for local values, and translated to the appropriate SQL for remote tables.
between {data.table}
between is equivalent to x >= lower & x <= upper when incbounds=TRUE, or x > lower & x < upper when FALSE.
To return the desired values:
x[between(x, min(y), max(y))]
Another option uses findInterval (y must be sorted, and by default the upper bound is excluded):
x[findInterval(x,y)==1L]
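As a quick check of how these options treat the endpoints, here is a minimal sketch. The vectors are made-up sample data, not the original poster's, and it assumes both dplyr and data.table are installed. The functions are called with their namespaces spelled out because both packages export a function named between.

```r
# Hypothetical sample data chosen so that both endpoints of y occur in x.
x <- c(0.5, 1, 3, 7, 9)
y <- c(1, 7)

# Closed interval [min(y), max(y)] in base R.
base_res <- x[x >= min(y) & x <= max(y)]

# The package helpers; both include the bounds by default.
dplyr_res <- x[dplyr::between(x, min(y), max(y))]
dt_res    <- x[data.table::between(x, min(y), max(y))]

# findInterval() uses half-open intervals [y[1], y[2]) by default,
# so a value equal to max(y) falls outside interval 1.
fi_open <- x[findInterval(x, y) == 1L]

# rightmost.closed = TRUE pulls max(y) back into the last interval.
fi_closed <- x[findInterval(x, y, rightmost.closed = TRUE) == 1L]

print(base_res)   # 1 3 7
print(fi_open)    # 1 3
print(fi_closed)  # 1 3 7
```

If the upper bound matters for your data, pass rightmost.closed = TRUE or use one of the between functions.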
findInterval appears to have a slight speed advantage (a few microseconds) on the author's original vector:
Unit: microseconds
                expr    min     lq     mean  median      uq     max neval
      dplyr::between 14.078 14.839 20.37472 18.6435 20.5455  60.876   100
 data.table::between 58.593 61.637 73.26434 68.2950 78.3780 160.560   100
        findInterval  3.805  4.566  6.52944  5.7070  6.6585  35.385   100
Update, with a large vector:
x <- runif(1e8, 0, 10)
y <- c(1, 7)
With a large vector, the results show a slight advantage for data.table, but in practice the timings are close enough that I'd use whichever package you already have loaded:
Unit: seconds
                  expr      min       lq     mean   median       uq      max neval
        dplyr::between 1.879269 1.926350 1.969953 1.947727 1.995571 2.509277   100
   data.table::between 1.064609 1.118584 1.166563 1.146663 1.202884 1.800333   100
          findInterval 2.207620 2.273050 2.337737 2.334711 2.393277 2.763117   100
 x>=min(y) & x<=max(y) 2.350481 2.429235 2.496715 2.486349 2.542527 2.921387   100
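The benchmark call itself is not shown above. The output format looks like the microbenchmark package, so a comparison along these lines could be run as in the sketch below. This is an assumption about the setup, not the exact code used: the seed is illustrative, and the vector size and times are reduced from 1e8 and 100 so that it finishes quickly.

```r
library(microbenchmark)  # assumed; the original call is not shown

set.seed(1)              # illustrative seed, not from the original post
x <- runif(1e6, 0, 10)   # smaller than the 1e8 used above
y <- c(1, 7)

res <- microbenchmark(
  "dplyr::between"        = x[dplyr::between(x, min(y), max(y))],
  "data.table::between"   = x[data.table::between(x, min(y), max(y))],
  "findInterval"          = x[findInterval(x, y) == 1L],
  "x>=min(y) & x<=max(y)" = x[x >= min(y) & x <= max(y)],
  times = 10L            # fewer repetitions than the 100 reported above
)
print(res)
```

Absolute timings depend on the machine and on the package versions, so treat the ranking as indicative only.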