I am working on boolean rules for terminal node assignment in CART-like trees, as part of my work (http://web.ccs.miami.edu/~hishwaran/ishwaran.html).
I have noticed problematic behavior when evaluating inequalities stored as character strings using eval(parse(text = ...)). The issue has to do with how R converts the internal representation of a number to text.
Here's an example involving the number pi. I want to check if a vector (which I call x) is less than or equal to pi.
> pi
[1] 3.141593
> rule = paste0("x <= ", pi)
> rule
[1] "x <= 3.14159265358979"
This rule checks whether the object x is less than or equal to pi, where pi is written out to 15 significant digits (14 decimal places). Now I will assign x the values 1, 2, 3 and pi:
> x = c(1,2,3,pi)
Here's x printed to 15 significant digits:
> print(x, digits=15)
[1] 1.00000000000000 2.00000000000000 3.00000000000000 3.14159265358979
Now let's evaluate this
> eval(parse(text = rule))
[1] TRUE TRUE TRUE FALSE
Whooaaaaa, it looks like pi is not less than or equal to pi. Right?
But if I hard-code the last element of x as the same 15-digit value that appears in the rule, it works:
> x = c(1,2,3,3.14159265358979)
> eval(parse(text = rule))
[1] TRUE TRUE TRUE TRUE
Presumably, in the first case x[4] holds pi at full double precision, while the rule text contains pi rounded to 15 significant digits by paste0. The stored value is therefore slightly larger than the parsed threshold, and the comparison returns FALSE. In the second case both sides come from the same 15-digit literal, so they are the identical double and the result is TRUE.
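To make the discrepancy concrete, here is a small check of the gap between the stored value and the value that ends up in the rule text. The names full and truncated are only illustrative, and I am assuming that paste0 converts the number the same way as.character does:

```r
# Full-precision double versus the 15-digit value written into the rule.
full <- pi
truncated <- as.numeric(as.character(pi))  # assumed to match what paste0() writes

# Print both with 17 significant digits to expose the difference.
sprintf("%.17g", full)
sprintf("%.17g", truncated)

full - truncated    # tiny positive gap
full <= truncated   # FALSE, which is why x[4] fails the rule
```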
How can I avoid this? I really need the first evaluation to come back TRUE, because I am automating this process for rule-based inference and cannot hard-code a value (here, pi) each time.
One solution I use is to add a small tolerance value.
> tol = sqrt(.Machine$double.eps)
> rule = paste0("x <= ", pi + tol)
> x = c(1,2,3,pi)
> eval(parse(text = rule))
[1] TRUE TRUE TRUE TRUE
However, this seems like an ugly solution.
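Another possibility, which I have only sketched and not tested in my real code, is to keep the threshold out of the text altogether so that it never gets rounded. Below, the first variant builds the rule as an unevaluated call with bquote, and the second keeps the rule as a string but refers to the threshold by name. The names thr, res_call and res_text are made up for this example:

```r
x <- c(1, 2, 3, pi)

# Variant 1: embed the exact double in an unevaluated call (no text round trip).
rule_call <- bquote(x <= .(pi))
res_call <- eval(rule_call)

# Variant 2: keep the rule as text, but look the threshold up by name.
# 'thr' is a hypothetical name supplied through the evaluation environment.
rule_text <- "x <= thr"
res_text <- eval(parse(text = rule_text), list(x = x, thr = pi))

res_call
res_text
```

Both variants compare x against the same double that is stored in pi, so no tolerance is needed. The cost is that the rule is no longer a self-contained string.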
Any comments and suggestions are greatly appreciated!