
I need to categorise a numeric vector. I used the function `cut()` and was happy with the result. However, digging into the result, I found a miscategorised item. A minimal working example is below, with my questions given step by step in the code comments:

# What can I change in `classify()` to avoid divergent classification results
# between `(18 - 1.8)/18 * 100` and 90? (see below)
classify <- function(x) cut(x, breaks = c(-Inf, 90, Inf), right = FALSE)

# Let's consider `a`:
a <- (18 - 1.8)/18 * 100
a
#> [1] 90

# Why is `a` not equal to `90`?
a == 90
#> [1] FALSE

# Why are the classification results different?
classify(a)
#> [1] [-Inf,90)
#> Levels: [-Inf,90) [90, Inf)
classify(90)
#> [1] [90, Inf)
#> Levels: [-Inf,90) [90, Inf)

Created on 2021-05-28 by the reprex package (v0.2.1)

  • Because if you print `format(a, digits = 20)`, you will see that `a` is not quite equal to 90. This kind of rounding error is common to every system that uses double-precision floating-point arithmetic, since a value such as 1.8 has no exact binary representation. – Dave2e May 28 '21 at 12:25
  • Thank you for the answer. Following it, I will use: `classify <- function(x) cut(round(x, 2), breaks = c(-Inf, 90, Inf), right = FALSE)`. – Francois Collin May 28 '21 at 13:32
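
The two comments above can be put together in a short sketch. It first exposes the hidden rounding error, then compares the exact and tolerance-based equality checks, and finally applies the rounding workaround. The name `classify_rounded()` and its `digits` argument are hypothetical, introduced here only for illustration; the default of 2 decimals is taken from the comment above and is an assumption about what precision is meaningful for the data.

```r
# Reproduce the value from the question.
a <- (18 - 1.8) / 18 * 100

# The default print shows 7 significant digits; asking for more
# reveals that `a` is slightly below 90.
format(a, digits = 20)

# Exact comparison fails, while comparison within a tolerance succeeds.
a == 90
#> [1] FALSE
isTRUE(all.equal(a, 90))
#> [1] TRUE

# Hypothetical variant of classify(): round to `digits` decimals
# before cutting, so that tiny floating-point errors do not push a
# value across the break at 90.
classify_rounded <- function(x, digits = 2) {
  cut(round(x, digits), breaks = c(-Inf, 90, Inf), right = FALSE)
}

# Both inputs now fall into the same interval.
identical(classify_rounded(a), classify_rounded(90))
#> [1] TRUE
```

Note that rounding also shifts the effective break: a value such as 89.996 rounds to 90 and therefore lands in the upper interval. Whether that is acceptable depends on the precision the data actually carries.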
