I want to scale some numbers with four decimal places to integers. Will using round (or ceiling) while scaling these numbers cause inaccuracies for some numbers?

In ghci: Expecting 1009

ghci> let x = 1.009 * 1000
ghci> x
1008.9999999999999
ghci> round x
1009
ghci> floor x
1008
ghci> ceiling x
1009

Expecting 100900000

ghci> let y = 1.009 * 100000000
ghci> y
1.0089999999999999e8
ghci> round y
100900000
ghci> floor y
100899999
ghci> ceiling y
100900000
pseuyi
  • Yes. (fillller) – Daniel Wagner Jul 19 '23 at 17:28
  • And it's not really a Haskell thing, that's how floating-point numbers generally work. Try the same thing in, say JavaScript. Or Ruby. Or anything else. – Fyodor Soikin Jul 19 '23 at 17:43
  • Actually, to be more precise: `round`, `floor`, and `ceiling` will not add new inaccuracies. But multiplying by 1000 can. – Daniel Wagner Jul 19 '23 at 17:44
  • @pseuyi, `1008.9999999999999` is a 17 digit value. Given the binary nature of floating point and finite precision, round to 15 or fewer decimal places. – chux - Reinstate Monica Jul 19 '23 at 17:45
  • i'm aware that the calculations with floating-point numbers won't be precise but would the variance be large enough that rounding (with any of those three functions) would not give the expected result? – pseuyi Jul 19 '23 at 18:07
  • @pseuyi what do you mean by rounding would not give the expected result? mathematically, the definitions of both functions ensure that `ceiling x - floor x == 1` on every non integer. – lsmor Jul 19 '23 at 18:19
  • @lsmor in the example i gave `floor 1008.9999999999999` gives `1008` instead of `1009`. `let y = x * 100000000` then `round y` -- is this output reliable? – pseuyi Jul 19 '23 at 18:48
  • Note that the root problem isn't really in the scaling. The source code text `1.009` does not actually stand for a runtime `Double` value equal to exactly 1.009, because there is no such `Double`. `1.009` as a 64-bit float is actually the value `1.008999999999999896971303314785473048686981201171875` (see https://www.exploringbinary.com/floating-point-converter/). Multiplying that by 1000 with **perfect** accuracy would still end up as something that starts with `1008.9999...`; the precision you need to get `1009` was never there. Floats are just like that, in any language. – Ben Jul 20 '23 at 04:01
  • @Ben Well, the scaling is arguably part of the root problem too, IMO. `1000 * 1.009` as a 64-bit float is `1008.9999999999998863131622783839702606201171875`, which isn't the result of multiplying the number you wrote by 1000 with perfect accuracy; that is, the multiplication has introduced additional rounding error. – Daniel Wagner Jul 20 '23 at 14:50
  • @DanielWagner Certainly, scaling introduces *additional* inaccuracy. I just wanted to be clear that FP inaccuracy isn't only about losing precision in *operations*. The original number wasn't precisely equal to what they wrote down, and would demonstrate the same issues with rounding down at the 3rd decimal place unexpectedly producing `1.008` instead of `1.009`; even if they avoid the scaling they still have to worry about this. Whereas 1000 does have a precise `Double` representation, and a decent class of other precisely-representable numbers can be scaled by 1000 with no loss of precision. – Ben Jul 20 '23 at 22:59
  • Two examples of interest. If you say that the way the number displays before multiplication is the "right value", then `round (521.6325*1000 :: Double) /= round (521.6325*1000 :: Rational)`. If you say that the unrounded meaning of the number before multiplication is the "right value", then `round (3398.0725 * 1000 :: Double) /= round (toRational (3398.0725 :: Double) * 1000)`. – Daniel Wagner Jul 21 '23 at 00:47
  • you said: `floor 1008.9999999999999` gives `1008` instead of `1009`, Of course... `floor 1008.9999...` **is** `1008`. Why do you expect it to be `1009`??; it is a wrong answer. I am not talking about haskell anymore, the mathematical result of that computation is `1008` and other answer is wrong. – lsmor Jul 21 '23 at 08:38

1 Answer

For every one of the 10000 numbers in the sequence 0.000, 0.001, 0.002, 0.003 ... 9.999, the function:

scale x = round (1000*x)

will generate the "correct" four digit result. This is because any floating point error introduced by the expression 1000*x will be a tiny fraction, on the order of 1e-13, so if the "correct" result of 1000*x is -- say -- 1009, then the smallest possible answer you could get will be something like:

1008.9999999999999

and the largest possible answer you could get will be something like:

1009.0000000000001

Both of these values are at least 10,000,000,000 times closer to 1009 than to any other integer, and since round rounds to the closest integer, it will always round to the correct answer.
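
The sequence is small enough that this claim can be checked exhaustively. The following is a minimal sketch rather than part of the original answer: `toDouble` and `allCorrect` are names introduced here, and it assumes that `fromIntegral n / 1000` produces the same `Double` as writing the corresponding literal (both should be the `Double` nearest to n/1000).

```haskell
-- Nearest Double to n/1000, i.e. the runtime value that a literal such as
-- 1.009 stands for (hypothetical helper, not from the answer).
toDouble :: Integer -> Double
toDouble n = fromIntegral n / 1000

-- The scaling function from the answer, specialised to Double.
scale :: Double -> Integer
scale x = round (1000 * x)

-- True when every value 0.000, 0.001, ..., 9.999 scales back to its index.
allCorrect :: Bool
allCorrect = all (\n -> scale (toDouble n) == n) [0 .. 9999]
```

Loading this in ghci and evaluating `allCorrect` should give `True`, since the accumulated error stays far below the 0.5 needed to push `round` to a neighbouring integer.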

Note that floor and ceiling are entirely different. They do not round to the closest integer, but rather to the closest integer in a particular direction, so they can clearly give the wrong answer. Specifically, they can give:

floor 1008.9999999999999 == 1008    -- wrong!
ceiling 1009.0000000000001 == 1010  -- wrong!
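
To see which inputs are affected, one option is to list the indices where `floor` or `ceiling` disagrees with the intended integer. This is only a sketch under the same assumption as before (`fromIntegral n / 1000` is taken to equal the literal's `Double` value); the names `toDouble`, `floorFailures`, and `ceilingFailures` are made up for illustration.

```haskell
-- Nearest Double to n/1000 (hypothetical helper).
toDouble :: Integer -> Double
toDouble n = fromIntegral n / 1000

-- Indices n in 0..9999 for which floor (1000*x) is not n.
floorFailures :: [Integer]
floorFailures = [n | n <- [0 .. 9999], floor (1000 * toDouble n) /= n]

-- Indices n in 0..9999 for which ceiling (1000*x) is not n.
ceilingFailures :: [Integer]
ceilingFailures = [n | n <- [0 .. 9999], ceiling (1000 * toDouble n) /= n]
```

In ghci, `elem 1009 floorFailures` is `True`, which matches the `floor x` result shown in the question.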

Scaling using round will work for larger scalings as well, to a point -- specifically, to the point where the scaling is around 1e16. Where the scaling is "only" 1e15, the transformation still works (for all x in the above sequence):

scale x = round (1000000000000000*x)

But, when the scaling is 1e16:

scale x = round (10000000000000000*x)

you start to get wrong answers:

ghci> scale 0.5005
5004999999999999
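
One way to spot such failures is to compare the floating-point result against an exact reference computed without any `Double`. The sketch below assumes the intended decimal is supplied as an integer count of ten-thousandths (so 0.5005 is passed as 5005); `scaleDouble` and `scaleExact` are hypothetical names, not functions from the answer.

```haskell
-- One-step approach: multiply in Double, then round.
scaleDouble :: Integer -> Double -> Integer
scaleDouble k x = round (fromInteger k * x)

-- Exact reference: the decimal is given as a count of ten-thousandths,
-- and the arithmetic is done in Rational, so no rounding error occurs
-- before the final round.
scaleExact :: Integer -> Integer -> Integer
scaleExact k tenThousandths =
  round (fromInteger (k * tenThousandths) / 10000 :: Rational)
```

With these definitions, `scaleDouble (10^15) 0.5005 == scaleExact (10^15) 5005` holds, while `scaleDouble (10^16) 0.5005` returns `5004999999999999` and `scaleExact (10^16) 5005` returns `5005000000000000`.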

If you want to get correct answers for all scalings, then do the scaling in two parts:

scale :: RealFrac n => n -> Integer
scale x = 10000000000000000000000000000000000 * round (1000*x)

Here, the expression round (1000*x) reliably scales any number from the sequence 0.000, 0.001, ..., 9.999 to the corresponding integer from the sequence 0, 1, ..., 9999 without the possibility of a floating point error, as explained above. From there, arbitrary precision Integer multiplication by any integer, no matter how large, will be error-free.
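
As a rough illustration of the difference, the two-part approach can be written with the large constant as a parameter and placed next to a one-step version. This is a sketch for the three-decimal sequence only (it scales by 1000 first); `scaleTwoPart`, `scaleOneStep`, and `factor` are names chosen here, not taken from the answer.

```haskell
-- Step 1 runs in Double, where the product is small enough for round to
-- recover the intended integer. Step 2 is exact Integer multiplication.
-- `factor` is a hypothetical parameter standing in for the large constant.
scaleTwoPart :: Integer -> Double -> Integer
scaleTwoPart factor x = factor * round (1000 * x)

-- For comparison: the whole scaling happens in Double before rounding,
-- so the result is limited to Double precision.
scaleOneStep :: Integer -> Double -> Integer
scaleOneStep factor x = round (fromInteger factor * 1000 * x)
```

For example, `scaleTwoPart (10^31) 1.009 == 1009 * 10^31` holds; comparing it with `scaleOneStep (10^31) 1.009` in ghci shows how far the one-step result can drift at that magnitude.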

K. A. Buhr
  • ...but beware that there exist some very interesting and useful numbers that do not fall in the sequence 0.000, 0.001, ... 9.999, and not all of those will scale by 1000 well. – Daniel Wagner Jul 20 '23 at 14:46