> I would like to know if there is some way to know in advance which real value (in decimal system) would be represented in an imprecise way like the 159.95 number.
In another answer I semiseriously answered "all of them",
but let's look at it another way. Specifically, let's look at
which numbers can be exactly represented.
The key fact to remember is that floating point formats use binary.
(The major, popular formats, anyway.) So the numbers that can be
represented exactly are the ones with exact binary representations.
Here is a table of a few of the single-precision `float` values
that can be represented exactly, specifically the seven
contiguous values near 1.0.
I'm going to show them as hexadecimal fractions, binary
fractions, and decimal fractions.
(That is, along each horizontal row, all three values are exactly
the same, just represented in different bases. But note that the
fractional hexadecimal and binary representations I'm using here
are not directly acceptable in C.)
| hexadecimal | binary | decimal | delta |
|---|---|---|---|
| 0x0.fffffd | 0b0.111111111111111111111101 | 0.999999821186065673828125 | 5.96e-08 |
| 0x0.fffffe | 0b0.111111111111111111111110 | 0.999999880790710449218750 | 5.96e-08 |
| 0x0.ffffff | 0b0.111111111111111111111111 | 0.999999940395355224609375 | 5.96e-08 |
| 0x1.000000 | 0b1.00000000000000000000000 | 1.000000000000000000000000 | |
| 0x1.000002 | 0b1.00000000000000000000001 | 1.000000119209289550781250 | 1.19e-07 |
| 0x1.000004 | 0b1.00000000000000000000010 | 1.000000238418579101562500 | 1.19e-07 |
| 0x1.000006 | 0b1.00000000000000000000011 | 1.000000357627868652343750 | 1.19e-07 |
There are several things to notice about this table:
- The decimal numbers look pretty weird.
- The hexadecimal and binary numbers look pretty normal, and show pretty clearly that single-precision floating point has 24 bits of precision.
- If you look at the decimal column, the precision seems to be about equivalent to 7 decimal digits.
- It's clearly not exactly 7 digits, though.
- The difference between consecutive values less than 1.0 is about 0.00000005, and greater than 1.0 is twice that, about 0.00000010. (More on this later.)
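A table like this one can be generated rather than typed in. The following is only a sketch: `float_step` and `print_float_neighbors` are names made up for this example, and the bit manipulation assumes `float` is IEEE 754 binary32 (true on mainstream platforms, but not guaranteed by C). Note also that although the plain fractional forms in the table are not valid C, C99 does accept hexadecimal floating constants with a binary exponent, such as `0x1.000002p0f`.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Move a positive, finite float up (steps > 0) or down (steps < 0) by that
   many representable values. Assumes float is IEEE 754 binary32, where
   adjacent positive values have adjacent bit patterns. */
float float_step(float x, int steps)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits += (uint32_t)steps;   /* unsigned wraparound handles negative steps */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Print the three values below `center`, `center` itself, and the three
   above it, each followed by the gap to the next representable value. */
void print_float_neighbors(float center)
{
    for (int i = -3; i <= 3; i++) {
        float v = float_step(center, i);
        float gap = float_step(v, 1) - v;
        printf("%-16a %.24f  %.3g\n", v, v, gap);
    }
}
```

Calling `print_float_neighbors(1.0f)` prints the seven values around 1.0. The `%a` conversion normalizes the leading hex digit, so its output will not look character-for-character like the hexadecimal column above, and whether `%.24f` prints every digit exactly depends on the C library.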
Here is a similar table for type `double`.
(I'm showing fewer columns because there's not enough room
horizontally for everything.)
| hexadecimal | decimal | delta |
|---|---|---|
| 0x0.ffffffffffffe8 | 0.99999999999999966693309261245303787291049957275390625 | 1.11e-16 |
| 0x0.fffffffffffff0 | 0.99999999999999977795539507496869191527366638183593750 | 1.11e-16 |
| 0x0.fffffffffffff8 | 0.99999999999999988897769753748434595763683319091796875 | 1.11e-16 |
| 0x1.0000000000000 | 1.0000000000000000000000000000000000000000000000000000 | |
| 0x1.0000000000001 | 1.0000000000000002220446049250313080847263336181640625 | 2.22e-16 |
| 0x1.0000000000002 | 1.0000000000000004440892098500626161694526672363281250 | 2.22e-16 |
| 0x1.0000000000003 | 1.0000000000000006661338147750939242541790008544921875 | 2.22e-16 |
You can see right away that type `double` has much better precision:
53 bits, or about 15 decimal digits' worth instead of 7, and with a much
finer spacing between "adjacent" numbers.
What does it mean for these numbers to be "contiguous" or
"adjacent"? Aren't real numbers continuous? Yes, true real
numbers are continuous, but we're not looking at true real
numbers: we're looking at finite-precision floating point, and we
are, literally, seeing the finite limit of the precision here.
In type `float`, there simply is no value (no representable
value, that is) between 1.00000000 and 1.00000012.
In type `double`, there is no value between 1.00000000000000000
and 1.00000000000000022.
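To see these gaps in action, here is a small sketch. The helper names `through_float` and `double_next_up` are invented for this example, and the second one assumes `double` is IEEE 754 binary64. Anything you try to store between two adjacent values simply gets rounded to one of them.

```c
#include <stdint.h>
#include <string.h>

/* Round a double to the nearest float, then widen it back to double.
   The volatile keeps the compiler from holding extra precision. */
double through_float(double x)
{
    volatile float f = (float)x;
    return (double)f;
}

/* Next representable double above a positive, finite x.
   Assumes double is IEEE 754 binary64. */
double double_next_up(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits += 1;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

With these, `through_float(1.00000005)` comes back as exactly 1.0, while `through_float(1.00000007)` comes back as 1.00000011920928955078125, the next `float` up. Likewise `double_next_up(1.0) - 1.0` is 2.22e-16, matching the delta column in the `double` table.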
So let's go back to your question, asking whether there's "some way
to know which decimal values are represented in a precise or imprecise way."
If you look at the nine one-digit decimal fractions between 1 and 2:
1.1
1.2
1.3
1.4
1.5
1.6
1.7
1.8
1.9
the answer is, only one of them is exactly representable in binary: 1.5.
If you break the interval down into 100 fractions, like this:
1.01
1.02
1.03
1.04
1.05
…
1.95
1.96
1.97
1.98
1.99
it turns out there are three fractions you can represent exactly:
.25, .50, and .75, corresponding to
¼, ½, and ¾.
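You can check counts like these empirically. The sketch below uses hypothetical helper names and is only valid for small inputs (roughly `n` below a few thousand and `scale` up to 1000), where the multiplication in the comparison is computed exactly in `double`.

```c
/* Returns 1 if the decimal fraction n/scale is exactly representable as a
   float, 0 otherwise. For the small n and scale used here, f * scale is
   computed exactly in double, so the comparison is reliable. */
int decimal_exact_as_float(long n, long scale)
{
    volatile float f = (float)((double)n / (double)scale);
    return (double)f * (double)scale == (double)n;
}

/* Count the exactly representable fractions strictly between 1 and 2
   for a given scale (10 for tenths, 100 for hundredths). */
int count_exact_between_1_and_2(long scale)
{
    int count = 0;
    for (long n = scale + 1; n < 2 * scale; n++)
        count += decimal_exact_as_float(n, scale);
    return count;
}
```

`count_exact_between_1_and_2(10)` returns 1 (just 1.5), and `count_exact_between_1_and_2(100)` returns 3 (1.25, 1.50, and 1.75).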
If we looked at three-digit decimal fractions, there are at most
seven of them we can represent: .125, .250, .375, .500, .625, .750, and .875. These correspond to eighths, that is, ordinary
fractions with 8 in the denominator.
I said "at most seven" because it's not true (none of these
estimates are true) for all ranges of numbers. Remember,
precision is finite, and digits to the left of the decimal part
— that is, in the integral part of your numbers — count against
your precision budget, too. So it turns out that if you were to
look at the range, say, 4000000–4000001, and tried to subdivide
it, you would find that you could represent 4000000.25 and
4000000.50 as type `float`, but not 4000000.125 or 4000000.375.
You can't really see it if you look at the decimal
representation, but what's happening inside is that type `float`
has exactly 24 bits of available precision, and the
integer part 4000000 uses up 22 of those bits, so you've only got
two bits left over for the fractional part, and with two bits you
can do halves and quarters, but not eighths.
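A quick way to test this sort of claim is to push a value through `float` and see whether it survives. This is a sketch with a made-up helper name; it relies on the test values themselves being exactly representable as `double`, which is true of the ones used here.

```c
/* Returns 1 if x (a double) can be stored in a float without change. */
int survives_float_roundtrip(double x)
{
    volatile float f = (float)x;   /* volatile: force rounding to float */
    return (double)f == x;
}
```

`survives_float_roundtrip(4000000.25)` and `survives_float_roundtrip(4000000.5)` return 1, but `survives_float_roundtrip(4000000.125)` and `survives_float_roundtrip(4000000.375)` return 0, because those two would need a third fractional bit.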
You're probably noticing a pattern by now: the fractions we've
looked at so far that can be represented exactly in binary
involve halves, quarters, and eighths, and if we looked further,
this pattern would continue: sixteenths, thirty-seconds,
sixty-fourths, etc. And this should come as no real surprise:
just as in decimal the "exact" fractions involve tenths,
hundredths, thousandths, etc., when we move to binary (base 2) the fractions
all involve powers of two. ½ in binary is `0b0.1`.
¼ and ¾ are `0b0.01` and `0b0.11`.
⅜ and ⅝ are `0b0.011` and `0b0.101`.
What about a fraction like 1/3? You can't represent it exactly
in binary, but since you can't represent it in decimal, either,
this doesn't tend to bother us too much. In decimal it's the
infinitely-repeating fraction 0.333333…, and in binary it's the
infinitely-repeating fraction `0b0.0101010101…`.
But then we come to the humble fraction 1/10, or one tenth.
This obviously can be represented as a decimal fraction — 0.1 —
but it turns out that it cannot be represented exactly in binary.
In binary it's the infinitely-repeating fraction `0b0.0001100110011…`.
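If you'd like to work out binary expansions like these yourself, the usual schoolbook method is to keep doubling the remainder and emit a 1 each time it reaches the denominator. Here is a minimal sketch; the function name is made up, and it assumes `0 <= num < den` with `den` small enough that doubling the remainder cannot overflow.

```c
/* Write the first ndigits binary digits of num/den (those after the
   binary point) into out, which must have room for ndigits + 1 chars.
   Assumes 0 <= num < den and den < UINT_MAX / 2. */
void binary_fraction_digits(unsigned num, unsigned den, int ndigits, char *out)
{
    unsigned rem = num;
    for (int i = 0; i < ndigits; i++) {
        rem *= 2;                 /* shift one binary place to the left */
        if (rem >= den) {
            out[i] = '1';
            rem -= den;
        } else {
            out[i] = '0';
        }
    }
    out[ndigits] = '\0';
}
```

Asking for 13 digits of 1/10 gives `0001100110011`, and the remainder never reaches zero, so the pattern repeats forever. For ⅝, three digits give `101` and the remainder is then zero, which is what makes the fraction exact.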
And this is why, as we saw above, you can't represent most of the other
"single digit" decimal fractions 0.2, 0.3, 0.4, …, either
(with the notable exception of 0.5), and you can't represent most
of the double-digit decimal fractions 0.01, 0.02, 0.03, …,
or most of the triple-digit decimal fractions, etc.
So returning once more to your question of which decimal
fractions can be represented exactly, we can say:
- For single-digit fractions 0.1, 0.2, 0.3, …, we can exactly represent .5, and to be charitable we can say that we can also represent .0, so that's two out of ten, or 20%.
- For double-digit fractions 0.01, 0.02, 0.03, …, we can exactly represent .00, .25, .50, and .75, so that's four out of a hundred, or 4%.
- For three-digit fractions 0.001, 0.002, 0.003, …, we can exactly represent the eight fractions involving eighths, so that's 8/1000 = 0.8%.
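These counts can also be derived with pure integer arithmetic: a fraction has a finite binary expansion exactly when its denominator, after reducing to lowest terms, is a power of two. The following sketch uses hypothetical function names, and it ignores the separate question of whether the value fits within 24 or 53 bits of precision.

```c
/* Greatest common divisor by Euclid's algorithm. */
unsigned long gcd_ul(unsigned long a, unsigned long b)
{
    while (b != 0) {
        unsigned long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Returns 1 if n/den has a finite binary expansion, that is, if the
   denominator in lowest terms is a power of two. Requires den > 0. */
int has_finite_binary_expansion(unsigned long n, unsigned long den)
{
    unsigned long d = den / gcd_ul(n, den);
    return (d & (d - 1)) == 0;    /* power-of-two test */
}

/* Count how many of 0/den, 1/den, ..., (den-1)/den are exact in binary. */
int count_finite_binary(unsigned long den)
{
    int count = 0;
    for (unsigned long n = 0; n < den; n++)
        count += has_finite_binary_expansion(n, den);
    return count;
}
```

`count_finite_binary(10)`, `count_finite_binary(100)`, and `count_finite_binary(1000)` return 2, 4, and 8, matching the list above. With four digits the count is 16 out of 10000, or 0.16%.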
So while there are some decimal fractions we can represent
exactly, there aren't very many, and the percentage seems to be
going down as we add more digits. :-(
The fact — and depending on your point of view it's either an
unfortunate fact or a sad fact or a perfectly normal fact —
is that most decimal fractions cannot be represented exactly
in binary and so cannot be represented exactly using computer
floating point.
The numbers that can be represented exactly using computer
floating point, although they can all be exactly converted into
numerically equivalent decimal fractions, end up converting to
rather weird-looking numbers for the most part, with lots of digits, as we saw above.
(In fact, for type `float`, which internally has 24 bits of
significance, the exact decimal conversions of values near 1.0 end up
having up to 24 digits after the decimal point. And the last nonzero
digit of the fraction is always 5.)
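If you want to see these long exact expansions for yourself, asking `printf` for plenty of digits will often do it. This is a sketch with an invented wrapper name, and it leans on an assumption about your C library: glibc and several others print the exact digits of the stored value, but the C standard does not require that for this many digits.

```c
#include <stdio.h>

/* Format x with `digits` places after the decimal point into buf.
   Whether every digit is exact depends on the C library in use. */
int format_float_digits(float x, int digits, char *buf, size_t size)
{
    return snprintf(buf, size, "%.*f", digits, (double)x);
}
```

On such a library, formatting `0x0.ffffffp0f` (the largest `float` below 1.0) with 24 digits reproduces the 0.999999940395355224609375 shown in the first table.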
One last point concerns the spacing between these "contiguous",
exactly-representable binary fractions. In the examples I've
shown, why is there tighter spacing for numbers less than 1.0
than for numbers greater than 1.0?
The answer lies in an earlier statement that "precision is
finite, and digits to the left of the decimal part count against
your precision budget, too". Switching to decimal fractions for
a moment, if I told you you had exactly 7 significant decimal
digits to work with, you could represent
1234.567
1234.568
1234.569
and
12345.67
12345.68
12345.69
but you could not represent
12345.678
because that would require 8 significant digits.
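The decimal version of this budget is easy to try out with `printf`'s `%g` conversion, which takes a count of significant digits. This is just an illustrative sketch with a made-up wrapper name.

```c
#include <stdio.h>

/* Format x using at most 7 significant decimal digits. */
int format_7_significant(double x, char *buf, size_t size)
{
    return snprintf(buf, size, "%.7g", x);
}
```

Given 1234.567 or 12345.67, this returns the number unchanged, but 12345.678 comes out as `12345.68`: the eighth digit has nowhere to go.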
Stated another way, for numbers between 1000 and 10000 you can
have three more digits after the decimal point, but for numbers
from 10000 to 100000 you can only have two. Mathematicians call
these intervals like 1000-10000 and 10000-100000 decades,
and within each decade, all the numbers have the same number of
fractional digits for a given precision, and the same exponents:
1.000000×10³ – 9.999999×10³,
1.000000×10⁴ – 9.999999×10⁴, etc.
(This usage is rather different than ordinary usage, in which the
word "decade" refers to a period of 10 years.)
But for binary floating point, once again, the intervals of
interest involve powers of 2, not 10. (In binary, some computer
scientists call these intervals binades, by analogy with "decades".)
The interesting intervals are from 1 to 2, 2–4, 4–8, 8–16, etc.
For numbers between 1 and 2, you've got 1 bit to the left of the
decimal point (really the "binary point"), so in single precision
you've got 23 bits left over to use for the fractional part to the right.
But for numbers between 2 and 4, you've got 2 bits to the left,
so you've only got 22 bits to use for the fraction.
This works in the other direction, too: for numbers between
½ and 1, you don't need any bits to the left of the binary
point, so you can use all 24 for the fraction to the right.
(Below ½ it gets even more interesting.) So that's why we
saw twice the precision (numbers half the size in the "delta"
column) for numbers just below 1.0 than for numbers just above.
We'd see similar shifts in available precision when crossing all the other
powers of two: 2.0, 4.0, 8.0, …, and also ½, ¼,
⅛, etc.
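Here is a small sketch that measures the spacing directly, so you can watch it double at each power of two. The function name is made up, and the code assumes `float` is IEEE 754 binary32 and that the argument is positive, finite, and below the largest `float`.

```c
#include <stdint.h>
#include <string.h>

/* Gap between a positive, finite float and the next larger float.
   Assumes IEEE 754 binary32, where incrementing the bit pattern of a
   positive value yields the next representable value. */
float gap_above(float x)
{
    uint32_t bits;
    float next;
    memcpy(&bits, &x, sizeof bits);
    bits += 1;
    memcpy(&next, &bits, sizeof next);
    return next - x;
}
```

`gap_above(0.5f)` is 2^-24 (about 5.96e-08), `gap_above(1.0f)` is 2^-23 (about 1.19e-07), `gap_above(2.0f)` is 2^-22, and so on. By the time you reach 4000000, `gap_above(4000000.0f)` is 0.25, which is the "two bits left for the fraction" situation described earlier.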
This has been a rather long answer, longer than I had intended.
Thanks for reading.
Hopefully now you have a better appreciation for which numbers can be
exactly represented in binary floating point, and why most of them can't.