precision of the digits comparison in R

Question

I have a vector of combined p-values:

> o=apply(first[,2:11],1,function(x){combine.test(x,method="z.transform")})
> tail(o)
[1] 0.9999999995 1.0000000000 0.9999999997 1.0000000000 0.0002175058 0.9917320029

I want to get rid of those that are equal to 1. However, when I filter for <1, I get:

> tail(o)<1
[1] TRUE TRUE TRUE TRUE TRUE TRUE

> tail(o)==1
[1] FALSE FALSE FALSE FALSE FALSE FALSE

It seems that the values shown as 1.0000000000 are some strange numbers.

How can I get rid of those strange 1.00000 values?

dput(first)
structure(list(Gene = c("ENSMUSG00000092486.1/RP23-3M10.7", "ENSMUSG00000092531.1/AC141469.5", 
"ENSMUSG00000092558.1/Med20", "ENSMUSG00000092586.1/Ly6g6c", 
"ENSMUSG00000092622.1/2410004A20Rik", "ENSMUSG00000092627.1/D130058E05Rik"
), `1` = c(0.999999, 0.116888889291925, 0.999999, 0.999999, 0.0356438866313227, 
0.338819427575004), `2` = c(0.999999, 0.16984670116627, 0.0949427348451135, 
0.999999, 0.0198038633633834, 0.444175650852497), `3` = c(0.337290753228492, 
0.999999, 0.999999, 0.115690963937986, 0.00094912741834492, 0.999999
), `4` = c(0.065059701538611, 0.147390334507149, 0.428856119856378, 
0.999999, 8.0249957121889e-05, 0.999999), `5` = c(0.999999, 0.999999, 
0.0099824266161115, 0.999999, 0.999999, 0.999999), `6` = c(0.999999, 
0.999999, 0.390023754495407, 0.00188057344411906, 0.058035758898251, 
0.44761301524626), `7` = c(0.04315700527774, 0.999999, 0.999999, 
0.999999, 0.214404456827703, 0.146838114471751), `8` = c(0.406400467867621, 
0.482290327519181, 0.44496129797812, 0.4310551014979, 0.344487266646367, 
0.0780371377632325), `9` = c(0.284690064722141, 0.999999, 0.999999, 
0.420531266751804, 0.362998909144492, 0.141348974658222), `10` = c(0.999999, 
0.999999, 0.999999, 0.999999, 0.021530155378956, 0.00713928192385325
), z_trans_combined = c(0.99999999949304, 0.999999999999999, 
0.999999999672598, 0.999999999999986, 0.000217505802858482, 0.991732002864124
), fisher_combined = c(0.571740537425434, 0.871888411704888, 
0.514120936458559, 0.440446119525803, 3.9948288121646e-07, 0.106343021839262
)), .Names = c("Gene", "1", "2", "3", "4", "5", "6", "7", "8", 
"9", "10", "z_trans_combined", "fisher_combined"), row.names = 15096:15101, class = "data.frame")

Possible duplicate of [Why are these numbers not equal?](http://stackoverflow.com/questions/9508518/why-are-these-numbers-not-equal) — etienne, Nov 10 '15 at 13:45
@etienne, jogo, there are two separate confusions about floating point; one is the confusion between the printed representation and the underlying value (occurring here), the other is the confusion over the inexactness of floating-point calculations. The one you cited is the latter. I don't know of a canonical dupe for the former. — Ben Bolker, Nov 10 '15 at 13:50
@Anni: could you give us `dput(first)` so we can reproduce the problem? — etienne, Nov 10 '15 at 13:55

score 1 · Accepted Answer · answered Nov 10 '15 at 14:29

With your code I get this:

o
       15096        15097        15098        15099        15100        15101 
0.9999999995 1.0000000000 0.9999999997 1.0000000000 0.0002175058 0.9917320029

but using as.character I get:

as.character(o)
[1] "0.99999999949304"     "0.999999999999999"    "0.999999999672598"    "0.999999999999986"    "0.000217505802858481"
[6] "0.991732002864123"

We can verify that those values are not exactly equal to 1 using 1-o:

1-o
       15096        15097        15098        15099        15100        15101 
5.069601e-10 6.661338e-16 3.274016e-10 1.409983e-14 9.997825e-01 8.267997e-03
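
To make the difference between the printed and the stored values explicit, here is a small sketch. It assumes the z_trans_combined values from the dput output in the question; these are 15-digit copies, so their last digits may differ slightly from the original o. Formatting rounds only for display, while the comparison works on the stored doubles.

```r
# Values copied from the z_trans_combined column of dput(first) in the question.
o <- c(0.99999999949304, 0.999999999999999, 0.999999999672598,
       0.999999999999986, 0.000217505802858482, 0.991732002864124)

# Rounded to 10 decimals, the 2nd and 4th entries look like exactly 1 ...
shown <- sprintf("%.10f", o)
print(shown)

# ... but 17 significant digits reveal what is actually stored.
full <- sprintf("%.17g", o)
print(full)

# Comparisons use the stored values, so no entry is equal to 1.
print(any(o == 1))
```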

The problem is that you want to drop values that are not equal to 1 but close enough to it. You can try to do that using the Rmpfr package:

library(Rmpfr)
mpfr(o,32)==1
[1] FALSE  TRUE FALSE  TRUE FALSE FALSE

because we have this:

mpfr(o,32)
6 'mpfr' numbers of precision  32   bits 
[1]    0.99999999953                1    0.99999999977                1 0.00021750580288    0.99173200293

You will see that the result depends only on the precision you choose (32 bits here):

mpfr(o,4)
6 'mpfr' numbers of precision  4   bits 
[1]        1        1        1        1 0.000214        1

mpfr(o,52)
6 'mpfr' numbers of precision  52   bits 
[1]    0.99999999949303975    0.99999999999999933    0.99999999967259834    0.99999999999998579 0.00021750580285848114
[6]    0.99173200286412344

You thus have to choose a precision high enough to keep values such as the first one but low enough to drop the ones that are too close to 1.
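
The same kind of cut-off can also be sketched in base R with an explicit tolerance. This is an alternative to the Rmpfr approach, not part of it. The threshold tol = 1e-12 is an assumption chosen so that the first value is kept and the two values nearest to 1 are dropped. The vector is copied from the z_trans_combined column of the question's dput output.

```r
# Values copied from the z_trans_combined column of dput(first) in the question.
o <- c(0.99999999949304, 0.999999999999999, 0.999999999672598,
       0.999999999999986, 0.000217505802858482, 0.991732002864124)

# Hypothetical tolerance: anything closer to 1 than this is treated as 1.
tol <- 1e-12

# Keep a value only if it lies more than tol below 1.
keep <- (1 - o) > tol
print(keep)

# Filtered vector without the entries that are "practically 1".
filtered <- o[keep]
print(filtered)
```

A larger tol, for example 1e-9, would also drop the first and third values, so the threshold has to be chosen with the data in mind.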

score 0 · Answer 2 · answered Nov 10 '15 at 13:56

You can try:

tail(o) < (1 - .Machine$double.eps) # or
tail(o) < (1 - 2*.Machine$double.eps)
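
Whether these thresholds filter anything depends on how far the values are from 1. The sketch below measures that gap in units of .Machine$double.eps. It assumes the z_trans_combined values from the question's dput output, which are 15-digit copies of the original o. For these values, even the entries printed as 1.0000000000 lie more than 2 * .Machine$double.eps below 1, so both comparisons above still return TRUE for them.

```r
# Values copied from the z_trans_combined column of dput(first) in the question.
o <- c(0.99999999949304, 0.999999999999999, 0.999999999672598,
       0.999999999999986, 0.000217505802858482, 0.991732002864124)

eps <- .Machine$double.eps

# Distance of each value from 1, expressed as a multiple of eps.
gap_in_eps <- (1 - o) / eps
print(gap_in_eps)

# The threshold from above applied to these values.
below <- o < (1 - 2 * eps)
print(below)
```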

jogo


What does "not work" mean? Is there no result? Possibly the values that are printed as 1.000000 are too far from 1 (exactly 1). Which threshold do you want to use? – jogo Nov 10 '15 at 14:08
