Managing precision problems in f64 variables: experiencing higher precision than expected

Question

How can I address precision issues with f64 variables in Rust? When subtracting two nearby f64 floats, I am observing a result with higher precision than the f64 format's standard precision. Typically, I would expect around 15-16 decimal places, but I'm getting more than 20. This causes further problems in my simulation.

Is there a solution that doesn't involve additional crates, and instead enforces variables to adhere to the standard number of decimals dictated by their format?

Basic example:

```rust
fn main() {
    let a: f64 = 0.01053710310220;
    let b: f64 = -0.01053710310221;
    let c: f64 = a + b;
    println!("c = {}", c);
}
```

c = -0.000000000000010000680839006293

That's not "higher precision". That's round-off garbage. That's the danger of using floating point. None of your numbers can be represented exactly in binary, so what you get is an approximation. Rust happens to show you the exact value that you got. And, btw, that number DOES have 15 digits of precision -- the leading 0s don't count. You just need to use a reasonable format to print it. — Tim Roberts, Aug 23 '23 at 03:18
You can see [here](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=32f6825ee79ba9a164f36d495e60a898) from the exaggerated digits of `a` and `b` that the number that is stored is not *exactly* the same that you typed. If you are unfamiliar with the imprecision of floating point variables, consider reading [Is floating point math broken?](/q/588004/2189130) — kmdreko, Aug 23 '23 at 03:22
Well 2^1000 is exactly representable as a `double` and has 300+ decimal digits, does it mean `double` now has 300 digits of precision? (Hint: it doesn't). — n. m. could be an AI, Aug 23 '23 at 04:36
This is a known effect when working with floating point numbers, they taught us this danger in university. The only way to mitigate it is by using a more stable algorithm, that doesn't suffer from the same problem. Like reordering the operations in your problem to prevent it. Here's more info about the problem: https://en.wikipedia.org/wiki/Catastrophic_cancellation — Finomnis, Aug 23 '23 at 06:25

score 0 · Answer 1 · answered Aug 23 '23 at 05:15

Typically, I would expect around 15-16 decimal places, but I'm getting more than 20.

Re-evaluate the case using base-2 math rather than base-10, because floating point uses a base-2 encoding. You will not find an increase.

chux - Reinstate Monica

Eric Postpischil · Accepted Answer · 2023-08-23T11:08:43.100

When subtracting two nearby f64 floats, I am observing a result with higher precision than the f64 format's standard precision. Typically, I would expect around 15-16 decimal places,…

f64 does not have a “standard precision” of 15-16 decimal places. Its precision is 53 binary digits.

When the source text 0.01053710310220 is compiled, it is converted to the nearest value representable in f64, which is 0.01053710310219999925218647973679253482259809970855712890625, which is 6,074,226,381,392,949•2⁻⁵⁹. (Observe that 6,074,226,381,392,949 fits in 53 bits, the precision of f64.) When -0.01053710310221 is compiled, it is converted to −0.0105371031022099999330254860296918195672333240509033203125, which is −6,074,226,381,398,714•2⁻⁵⁹. When these are added, the result is −5,765•2⁻⁵⁹, which is exactly −0.00000000000001000068083900629289928474463522434234619140625.

You can subtract 6,074,226,381,398,714 from 6,074,226,381,392,949 by eye—most of the leading digits cancel, and it is just 8,714−2,949—and verify the result, 5,765, is correct. And so you can see no extra precision appeared.
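To check these integers yourself, here is a minimal sketch that splits an f64 into an integer significand and a power-of-two exponent using `f64::to_bits`. The helper name `decompose` is hypothetical (it is not a standard library function), and the sketch assumes finite inputs only.

```rust
/// Hypothetical helper: split a finite f64 into (significand, exponent)
/// such that value == significand * 2^exponent.
/// Infinities and NaN are not handled in this sketch.
fn decompose(x: f64) -> (i64, i32) {
    let bits = x.to_bits();
    let negative = (bits >> 63) == 1;
    let biased_exp = ((bits >> 52) & 0x7ff) as i32;
    let fraction = bits & ((1u64 << 52) - 1);
    // Normal numbers carry an implicit leading 1 bit; subnormals do not.
    let (significand, exp) = if biased_exp == 0 {
        (fraction, -1074)
    } else {
        (fraction | (1u64 << 52), biased_exp - 1075)
    };
    let s = significand as i64; // at most 53 bits, so it fits in i64
    (if negative { -s } else { s }, exp)
}

fn main() {
    let a: f64 = 0.01053710310220;
    let b: f64 = -0.01053710310221;

    let (ma, ea) = decompose(a);
    let (mb, eb) = decompose(b);
    println!("a = {} * 2^{}", ma, ea);
    println!("b = {} * 2^{}", mb, eb);

    // Both inputs share the same exponent, so adding them is just
    // integer addition of the significands.
    println!("significand sum = {}", ma + mb);

    // The stored sum is normalized: the same value, written with a
    // larger significand and a smaller exponent.
    let (mc, ec) = decompose(a + b);
    println!("c = {} * 2^{}", mc, ec);
}
```

The significand of `c` is the integer sum shifted left to fill 53 bits, with the exponent lowered to match. The value is unchanged and no extra digits are created.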

… enforces variables to adhere to the standard number of decimals dictated by their format?

There is nothing in the standard for this format, IEEE 754, that dictates a number of decimals.

How can I address precision issues with f64 variables in Rust?

This is too vague and broad a question. Do you want to work with 16-digit decimal numbers? At a fixed scale or at a varying scale? What operations do you need to do with them—just addition, as you have shown, or other elementary arithmetic? Do you need square roots, sines, logarithms, and other functions? What is the range of numbers you need to work with? Do you need exact results or can you accept approximations? How much?

Kevin Reid · Answer 3 · 2023-08-23T15:45:06.947

Is there a solution that … enforces variables to adhere to the standard number of decimals

The thing you are wishing for here is fixed-point arithmetic. It's called “fixed point” because the number of digits before and after “the point” (it's usually binary, not decimal, so I won't say “decimal point”) is fixed, as opposed to “floating point” (f64 and friends), where the point is shifted based on the magnitude of the number.

Your 0.000000000000010000680839006293 does have “around 15-16” digits of precision: the zeroes don't count, because the “point” has “floated” leftward. In order to not have this effect, you must do something other than using floating-point.
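One way to see this is to print the value in scientific notation, so the leading zeroes move into the exponent. This is only a sketch; the helper name `sixteen_significant_digits` is made up for illustration.

```rust
/// Hypothetical helper: format with one digit before the point and
/// fifteen after it, i.e. 16 significant decimal digits.
fn sixteen_significant_digits(x: f64) -> String {
    format!("{:.15e}", x)
}

fn main() {
    let a: f64 = 0.01053710310220;
    let b: f64 = -0.01053710310221;
    let c = a + b;

    // Default formatting: positional notation, leading zeroes included.
    println!("{}", c);
    // Scientific notation, shortest digits that round-trip.
    println!("{:e}", c);
    // Scientific notation with an explicit number of digits.
    println!("{}", sixteen_significant_digits(c)); // -1.000068083900629e-14
}
```

Formatting only changes how the value is displayed. The stored f64 is the same in all three lines.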

…that doesn't involve additional crates,…

The Rust standard library does not include fixed-point numeric types. You can:

use an additional crate,
write your own fixed-point type,
or do it implicitly with an ordinary integer type. Fixed-point can be seen as just choosing specific units: store microseconds instead of seconds, or whatever unit and scale applies to your situation. The required scaling factors are just unit conversions.
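As a rough sketch of the last option, assume the simulation only needs 14 decimal places and that values stay within about ±92,000, so they fit in an i64 at this scale. The scale and the names `UNITS_PER_ONE` and `format_fixed` are illustrative choices, not anything from the question.

```rust
/// Assumed scale for this sketch: one integer unit represents 1e-14.
const UNITS_PER_ONE: i64 = 100_000_000_000_000; // 10^14

/// Hypothetical helper: render a scaled integer as a decimal string
/// with exactly 14 digits after the point.
fn format_fixed(units: i64) -> String {
    let sign = if units < 0 { "-" } else { "" };
    let abs = units.unsigned_abs();
    let scale = UNITS_PER_ONE as u64;
    format!("{}{}.{:014}", sign, abs / scale, abs % scale)
}

fn main() {
    let a: i64 = 1_053_710_310_220; // 0.01053710310220 in units of 1e-14
    let b: i64 = -1_053_710_310_221; // -0.01053710310221 in units of 1e-14

    // Integer addition is exact: no round-off appears in the result.
    let c = a + b;
    println!("c = {}", format_fixed(c)); // c = -0.00000000000001
}
```

Addition and subtraction need no extra work at a fixed scale. Multiplication and division need a rescaling step (and probably a wider intermediate type such as i128), which is where a hand-written fixed-point type or a crate starts to pay off.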

There is no way to cause f64 to become the type you are imagining. You could round after each arithmetic operation, but that would be easy to get wrong, and it would not give you a fixed number of decimal digits because conventional floating-point numbers are binary, not decimal; therefore, they do not store a whole number of decimal digits in them. There is no f64 value that exactly represents 0.1, let alone 0.00000000000001. (Mathematically, this is because 0.1 = 1/10 = 1/(5 × 2) and binary place-shifting can only divide by 2, not by 5 or 10.)
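To illustrate, the sketch below prints the exact value stored for 0.1, then rounds the question's result to 14 decimal places and prints what is actually stored afterwards. The helper name `exact_decimal` is hypothetical. It relies on Rust's float formatting producing exact digits for whatever precision is requested.

```rust
/// Hypothetical helper: print `x` with `places` digits after the point.
fn exact_decimal(x: f64, places: usize) -> String {
    format!("{:.*}", places, x)
}

fn main() {
    // The f64 nearest to 0.1 is slightly larger than 0.1.
    println!("{}", exact_decimal(0.1, 55));
    // 0.1000000000000000055511151231257827021181583404541015625

    let a: f64 = 0.01053710310220;
    let b: f64 = -0.01053710310221;
    let c = a + b;

    // Round to 14 decimal places, then look at what is really stored.
    let rounded = (c * 1e14).round() / 1e14;
    println!("{}", rounded); // shortest form looks clean
    println!("{}", exact_decimal(rounded, 60)); // stored value still has a binary tail
}
```

The rounded value is just the f64 closest to −0.00000000000001, not that decimal number itself, so the next arithmetic operation starts from an approximation again.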

You mean 0.1 = 1/10 = 1/(5 x **2**), right? — Chayim Friedman, Aug 23 '23 at 08:10
@ChayimFriedman Fixed, thanks. — Kevin Reid, Aug 23 '23 at 15:45
