Why has '32' been added to the value of v in the output below?

#include <iostream>

int main()
{
    float v = 9.999e8;

    std::cout << "v --> " << v << std::endl;

    std::cout << std::fixed;
    std::cout << "v --> " << v << std::endl;

    return 0;
}

output:

v --> 9.999e+08
v --> 999900032.000000
             ^^ 

Is it an artefact of printing the value, or is it that 9.999e+08 cannot be represented exactly in a float?

Changing v to a double resolves the issue but I want to understand the actual cause.

hanlonj

2 Answers

With the common desktop implementation of float, the IEEE 754 standard, there are only 24 significand bits (and 8 exponent bits). This means that consecutive integers can be represented, without precision loss, only up to 2^24, which is approximately 1.7E7 (1.7•10^7). For example, an odd value above this threshold can't be represented exactly by such a float, since the least significant 1-bit "falls off" the representation.

Your value is approximately 1E9 (10^9), which is about 60 times larger. At that magnitude (between 2^29 and 2^30) adjacent floats are 64 apart, so the stored value can be up to 32 off after rounding to the nearest representable value, which is exactly what happened here. Note that the exponent bits contribute 2^exp (sans the nitpicking about the offset thing), and not 10^exp.

With double there are 53 significand bits, which are more than enough for 1E9 (10^9).

Michael Veksler

This is the nature of finite precision math. If you try to store a value that can't be represented exactly, at best you'll get the closest representable value.

Have a look at the binary:
999900032 -> 111011100110010100001110000000
999900000 -> 111011100110010100001101100000

So representing 999900000 exactly would take 25 significant bits, two more than 999900032 needs. Presumably, that's more than float provides on your system (an IEEE 754 float has 24).

David Schwartz