I have a question regarding selecting specific values from a vector in R. More specifically, I want to select all integer values from a given variable in my dataset (I want to use these to subset my data). Here is an example:

x <- seq(0,10,1/3)

Now I want to select all the observations in the vector x that are whole numbers. My first idea was to use the `is.integer` command, but this does not work. I found a workaround using the following:

> x==as.integer(x)
 [1]  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE
[17] FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE FALSE FALSE  TRUE

Now I can simply type

> which(x==as.integer(x))
 [1]  1  4  7 10 13 16 19 22 25 28 31

and I get the expected result (and I can use this vector for subsetting my dataset). But isn't there a more direct way to select integer values?

Flow
  • `is.integer` acts on the whole object rather than on single elements, and here the class is `numeric`, not `integer`. I think your solution looks good. A slightly more compact version would be `which(x==x%/%1)` – akrun May 27 '15 at 08:14
  • You can also use `round`; I don't know which one is more efficient. – Cath May 27 '15 at 08:15
  • @CathG But I remember that it might give different results in edge cases. – akrun May 27 '15 at 08:19
  • @DavidArenburg code golfing: `which(!x%%1)` – akrun May 27 '15 at 08:21
  • You don't really need which(), as you can subset with a boolean vector. – JohannesNE May 27 '15 at 08:22
  • So `x[!x%%1]` is the solution ;-) – Cath May 27 '15 at 08:23
  • Thank you everyone for your quick help! I realize that in the vector x there are technically no integers, but I thought there might be something implemented to recognize whole numbers. Thanks for sharing your other solutions! I didn't know the operators `%/%` and `%%`. – Flow May 27 '15 at 08:23
  • Am I the only one who thinks this is dangerous? Values created with seq() using non-integer steps are not guaranteed to always land on integer increments. Shouldn't you be using `abs(x-round(x)) <= fuzz` as the test? – IRTFM May 27 '15 at 08:37
  • `plyr::round_any(x,1) == x` – Nitro Dec 29 '19 at 17:31
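
A few of the points raised in the comments above can be checked directly. This is only a minimal sketch using base R; it shows that `is.integer` reports the storage type of the whole vector, what `%/%` and `%%` return, and that `which()` is optional when subsetting (assuming the vector contains no `NA` values).

```r
x <- seq(0, 10, 1/3)

# is.integer() tests the storage type of the whole object, not each element
is.integer(x)        # FALSE: seq() with a fractional step returns doubles
is.integer(c(1, 2))  # FALSE: plain numeric literals are doubles
is.integer(1:2)      # TRUE: the colon operator yields integer storage

# %/% is integer division, %% is the remainder
7.5 %/% 1            # 7
7.5 %% 1             # 0.5

# A logical vector subsets directly, so which() is not required here
keep <- x == as.integer(x)
identical(x[keep], x[which(keep)])  # TRUE when keep contains no NA
```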

2 Answers

This is a counterexample to the suggestion to use modulo operators:

> x <-  seq(1/3, 9 , 1/3)
> x[!x%%1]
[1] 1 3 4 9
> x
 [1] 0.3333333 0.6666667 1.0000000 1.3333333 1.6666667 2.0000000
 [7] 2.3333333 2.6666667 3.0000000 3.3333333 3.6666667 4.0000000
[13] 4.3333333 4.6666667 5.0000000 5.3333333 5.6666667 6.0000000
[19] 6.3333333 6.6666667 7.0000000 7.3333333 7.6666667 8.0000000
[25] 8.3333333 8.6666667 9.0000000

There are many similar questions on SO about why you should not assume that typical operations on numeric values reliably produce exact integers. The canonical warning is R FAQ 7.31, "Why doesn't R think these numbers are equal?", which on my device is also available through the R help pages. A more reliable approach would be:

> x[ abs(x-round(x) ) < 0.00000001 ]
[1] 1 2 3 4 5 6 7 8 9
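
The tolerance test above can be wrapped in a small helper. The sketch below is one possible form, not a definitive implementation: the name `is_whole` is hypothetical and the default tolerance is an assumption that should be tuned to the scale of your data. It follows the same pattern as the `is.wholenumber` example in R's `?is.integer` help page.

```r
# Hypothetical helper; the default tolerance is an assumption, adjust as needed
is_whole <- function(x, tol = sqrt(.Machine$double.eps)) {
  abs(x - round(x)) < tol
}

x <- seq(1/3, 9, 1/3)

# Exact comparison misses values that are off by a rounding error
x[6] == 2        # FALSE with IEEE 754 doubles
x[6] - 2         # a tiny nonzero difference

# The tolerance-based test keeps all nine whole numbers
x[is_whole(x)]
```

Because the result is a logical vector, it can be used directly to subset a data frame, for example `df[is_whole(df$v), ]` for a hypothetical data frame `df` with a numeric column `v`.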
IRTFM

Although your solution is already a good one, here is another way of doing it, summing up all the comments that emerged from your question:

x <- seq(0, 10, 1/3)

# select the elements of x whose remainder after Euclidean division by 1 is 0
x[!x%%1] 
#[1]  0  1  2  3  4  5  6  7  8  9 10

NB: Because of how floats are stored, this answer (and your solution too) may sometimes fail; see @BondedDust's answer.

To make sure everything goes well, we need to add a tolerance to the test, which gives a more complicated but more robust answer:

tol <- 1e-12
x[sapply(x, function(y) min(abs(c(y%%1, y%%1-1))) < tol)]

With BondedDust's example:

x <-  seq(1/3, 9 , 1/3)
x[sapply(x, function(y) min(abs(c(y%%1, y%%1-1))) < tol)]
[1] 1 2 3 4 5 6 7 8 9
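
Since `%%` is already vectorised, the same tolerance check can probably be written without `sapply`. The following is a sketch under the assumption that `x` is numeric without `NA` values; the names `r` and `near_whole` are introduced here only for illustration.

```r
x <- seq(1/3, 9, 1/3)
tol <- 1e-12

# Remainder after division by 1; lies in [0, 1) for finite x
r <- x %% 1

# Distance to the nearest whole number is the smaller of r and 1 - r
near_whole <- pmin(r, 1 - r) < tol
x[near_whole]

# Same selection as the sapply() version
same <- sapply(x, function(y) min(abs(c(y %% 1, y %% 1 - 1))) < tol)
identical(near_whole, same)
```

`pmin` takes the element-wise minimum, so it plays the role that `min` plays inside the anonymous function, applied to the whole vector at once.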
Cath
  • Sorry. Need to downvote because it propagates a dangerous set of misconceptions about numerical operations. – IRTFM May 27 '15 at 08:48
  • @BondedDust, don't worry, I thought you would, and the warning is totally justified. To be honest, I don't really like using integer division with non-integers... – Cath May 27 '15 at 08:50
  • @BondedDust, I edited my answer with a warning and a solution that takes a tolerance into account. I hope you'll find it less dangerous that way – Cath May 27 '15 at 09:34