I'm looking for a way to consistently ignore small differences between floating point numbers in R (these are double-precision floating-point numbers as per IEC 60559), using only base R tools and without resorting to C or C++. In other words, I would like to "round" the significand portion of the double-precision representation so that things like this return TRUE instead of FALSE:
1.45 - .55 == 2.45 - 1.55
## [1] FALSE
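To be clear, the two results differ only by a tiny amount, which you can see by printing more digits or taking the difference:
print(1.45 - .55, digits = 22)
print(2.45 - 1.55, digits = 22)
(1.45 - .55) - (2.45 - 1.55)  # non-zero, but only on the order of .Machine$double.eps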
Something like:
round_significand(1.45 - .55, bits=48) == round_significand(2.45 - 1.55, bits=48)
## [1] TRUE
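To make the goal concrete, here is the kind of thing I imagine round_significand doing, sketched in base R only (this is a rough attempt of mine, not an existing function; it keeps roughly bits bits of the significand by scaling by a power of two, and I have not checked subnormals or values sitting exactly on a power of two):
round_significand <- function(x, bits = 48) {
  # binary exponent e such that 2^e <= abs(x) < 2^(e+1); log2() is itself
  # inexact, so this can be off by one right at a power of two
  e <- floor(log2(abs(x)))
  # shift the bits we want to keep into the integer part, round, shift back;
  # scaling by a power of two is exact, so only round() discards information
  scale <- 2 ^ (bits - 1 - e)
  ifelse(x == 0, 0, round(x * scale) / scale)
}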
A simple round() doesn't work because the level to which we need to round depends on the magnitude of the number. data.table does something of the sort internally; from ?setNumericRounding:
Computers cannot represent some floating point numbers (such as 0.6) precisely, using base 2. This leads to unexpected behaviour when joining or grouping columns of type 'numeric'; i.e. 'double', see example below. In cases where this is undesirable, data.table allows rounding such data up to approximately 11 s.f. which is plenty of digits for many cases. This is achieved by rounding the last 2 bytes off the significand. Other possible values are 1 byte rounding, or no rounding (full precision, default).
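For reference, data.table exposes this through setNumericRounding(), though as far as I can tell it only affects data.table's own joins and grouping, not ordinary == comparisons:
library(data.table)
setNumericRounding(2L)  # round the last 2 bytes off the significand when joining/grouping
getNumericRounding()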
I'm working on a hack implementation that scales everything to a decimal number x with one digit before the decimal point, i.e. such that floor(log10(abs(x))) == 0, rounds that, and scales back, e.g.:
rnd_sig <- function(x, precision = 10) {
  # decimal exponent of x, so that abs(x) * 10^(-exp) lies in [1, 10);
  # treat x == 0 separately since log10(0) is -Inf
  exp <- ifelse(x == 0, 0, floor(log10(abs(x))))
  # round the scaled value to `precision` decimal places, then scale back
  round(x * 10 ^ (-exp), precision) / 10 ^ (-exp)
}
but I don't know enough about floating point numbers to be sure this is safe (or, more precisely, when it is safe and when it is not).
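For what it's worth, this is how I'd use it on the example from the top (I've only tried a few cases by hand, so I'm not sure the expected results hold in general):
rnd_sig(1.45 - .55) == rnd_sig(2.45 - 1.55)  # both sides should round to 0.9, so this should now be TRUE
rnd_sig(c(0, -123.456, 1e-12))               # the ifelse() guard keeps x == 0 from turning into NaN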