I observe this:

> class(x)
[1] "numeric"
> str(x) 
num [1:2500] 1 1 1 1 1 1 1 1 1 1 ...
> table(x)
   1 
2500 
> table(x == 1)
FALSE  TRUE 
  299  2201 
> all.equal(x, rep(1,length(x)))
[1] TRUE
> dput(x)
c(1, ..... 1)  # all ones

How is this possible? I understand that floating-point numbers should not, in general, be compared using ==, but shouldn't table be consistent with ==?

PS. Apparently, table is consistent with all.equal and not with == because it converts its arguments to factors (i.e., strings) first.

PPS. table(x-1) shows the non-0 values.

sds
2 Answers

Where in the documentation is it promised that they would be consistent? table expects "one or more objects which can be interpreted as factors", i.e., it internally calls factor(x), which first converts x to character and then to a factor. Values that differ as doubles but print as the same string therefore end up in the same level.

x <- 1 - 1e-16
x == 1
#[1] FALSE
as.character(x)
#[1] "1"
factor(x) == "1"
#[1] TRUE
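
To reproduce the situation from the question, here is a minimal sketch. The data are hypothetical: the asker's actual x is not shown, so this assumes 299 of the 2500 values sit one representable step below 1. It shows that == tells the two doubles apart while factor() (and hence table()) merges them into one level, and that subtracting 1 first, as the PPS suggests, makes the difference visible.

```r
# Hypothetical data (an assumption, not the asker's actual vector):
# 2500 values, 299 of which are the nearest double below 1.
x <- rep(1, 2500)
x[1:299] <- 1 - .Machine$double.eps / 2   # 1 - 2^-53

# Exact comparison distinguishes the two doubles.
n_exact <- sum(x == 1)              # 2201 values are exactly 1
n_distinct <- length(unique(x))     # 2 distinct doubles

# factor() labels levels via as.character(), which keeps about
# 15 significant digits, so both doubles get the label "1".
lev <- levels(factor(x))            # "1"

# After subtracting 1 the two values no longer print alike,
# so they become separate levels.
lev_shifted <- levels(factor(x - 1))

# 17 significant digits are enough to tell any two doubles apart.
sprintf("%.17g", unique(x))
```

If exact grouping is needed, one option is to tabulate a representation that preserves the difference, for example table(sprintf("%.17g", x)).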
Roland

Just addressing a possible misunderstanding about what all.equal does. table is not consistent with all.equal, because the latter by default includes a tolerance factor when comparing numeric values. From ?all.equal:

tolerance
numeric ≥ 0. Differences smaller than tolerance are not reported. The default value is close to 1.5e-8.

That is, all.equal should really be interpreted as meaning "all approximately equal" (to within a given limit of numerical precision).
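
A short sketch of the tolerance behaviour described above. The values in x are made up for illustration; tolerance is the documented argument of all.equal.

```r
# Illustrative values (an assumption): close to 1, but not all exactly 1.
x <- c(1, 1 - 1e-10, 1 + 1e-10)
y <- rep(1, length(x))

# With the default tolerance (about 1.5e-8) the difference is not reported.
eq_default <- isTRUE(all.equal(x, y))                 # TRUE

# With tolerance = 0 any difference is reported, so the result is not TRUE.
eq_strict <- isTRUE(all.equal(x, y, tolerance = 0))   # FALSE

# An elementwise "approximately equal" check with an explicit threshold.
close_enough <- abs(x - y) < 1e-8

# Exact elementwise comparison, for contrast.
exact <- x == y
```

Because all.equal returns a character description of the difference rather than FALSE when the values differ, wrapping it in isTRUE() is the usual way to use it in a condition.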

Hong Ooi