Labeling float or double as having 7 or 15 digits is a rough estimate at best and should not be used for any sort of numerical analysis of the precision.
To evaluate precision and numerical effects, you should always consider the float and double types to be binary numerals with (in the typical formats) 24 and 53 bits of precision, because that is how they are actually represented. Binary and decimal representations are incommensurate in various ways, so trying to understand or analyze the behavior in decimal makes it hard to understand the binary effects.
The numbers you are looking at, std::numeric_limits<Type>::max_digits10, which are 9 and 17 for the typical float and double formats, are not intended to be measures of precision. They are essentially meant to solve this problem:
- I need to write a floating-point number in decimal to a file and later read the decimal numeral from that file back into a floating-point number. How many decimal digits do I need to write to guarantee that reading it back will restore the original number?
It is not a measure of the accuracy of the floating-point format. It includes some “extra” digits that are caused by the discrepancy between binary and decimal, to allow for the fact that they are “offset” in a certain sense and do not line up. It is as if you have an oddly shaped object you are trying to put into a rectangular box: you need a box that actually has more area than the object because the fit is not perfect. Similarly, max_digits10 specifies more decimal digits than the actual information content of the floating-point type. So it is not a correct measure of the precision of the type.
The parameters that give you information about the precision of the type are std::numeric_limits<Type>::radix and std::numeric_limits<Type>::digits. The first is the numeric base used for floating-point, which is 2 in common C++ implementations. The second is the number of digits the floating-point type has. Those are the actual digits used in the floating-point format; its significand is a numeral formed of that many base-radix digits. For example, for common float and double types, radix is 2, and digits is 24 or 53, so they use 24 base-two digits and 53 base-two digits, respectively.