decimal number and exact representation

Question

In c++, how to know when a decimal number can be exactly representing using IEEE 754-1985 standard. for instance 0.2 cannot be representing exactly.

Is there a simple rule ?

What programming language are we talking about? Because in C# for instance a double or decimal value you can represent a value exactly. — Security Hound, Jan 30 '12 at 18:09
@Ramhound: no, you can't represent all values exactly. In C# `double` uses the IEEE rules referred here, and `decimal` isn't magical: it uses similar rules, but with base 10 instead of base 2. — R. Martinho Fernandes, Jan 30 '12 at 18:21
A simple rule-of-thumb: integers, and decimal fractions ending in 5. (Of course, there are also limits on the range and precision to consider, which is much less simple.) — Mike Seymour, Jan 30 '12 at 18:52

Kerrek SB · Accepted Answer · 2012-01-30T22:28:36.483

A number can be represented exactly as an IEEE755 float if it can be written as B × 2ⁿ, where B is an integer (and B and n fall into some valid range). In other words, there must be some integer n such that if you mutliply your number by 2ⁿ you get an integer. Clearly for 1/5 there is no such n.

Yet another way of saying it is that your number has to be the sum of finitely many powers of two which are not too far apart (the maximal distance between the powers is the precision of your float).

Yet another way to say it very loosely is that "rational numbers whose denominator is a power of two" are representably (though with the obvious precision constraints).

The precision of the float, which is the width of B, is 24 bits for single, 53 bits for double and 64 bits for extended double precision.

I would state "(and B and n fall into some valid range)" -- you've implied this, with your last paragraph, but the first paragraph is missing a requirement. — Jason S, Jan 30 '12 at 18:40

Mark Ransom · Answer 2 · 2012-01-30T18:40:41.470

5

Yes, there is a simple rule. If the fractional part cannot be represented by some number n over some power of 2, it cannot be exactly represented - it will repeat indefinitely. Otherwise it will be exact as long as the representation fits in the number of bits available.

So for example 0.75 works, because it is 3/4. There's no way to make 0.2 work, because it's 1/5 and there's nothing you can adjust it to that leaves the numerator a power of 2.

The reason so many decimal numbers can't be exactly represented is that decimal numbers have a combination of 2 and 5 in the denominator. You can only get an exact representation if you can simplify the fraction to remove all the 5's.

To give another example, consider 0.625. As a fraction it is 625/1000, but it simplifies down to 5/8. The simplified form has a power of 2 on the bottom, so it will be exact.

One interesting side effect is that all exactly representable decimals end in a "5". If it doesn't then you can tell very quickly that it can't be exact.

edited Jan 30 '12 at 18:40

answered Jan 30 '12 at 18:18

Mark Ransom

299,747
42
398
622

These are necessary conditions but not sufficient. For instance, the integer 2^53 can be represented exactly in a double-precision floating point, but the integer 2^53 + 1 cannot. – Jason S Jan 30 '12 at 18:24
@JasonS, true enough - you can easily start to run out of bits. But the question was about a "simple" rule and that's what I tried to give. – Mark Ransom Jan 30 '12 at 18:26
You would be correct if you used the converse of your first sentence (changed the phrase "can be" in your first sentence to "cannot be" and fixed the last clause). I don't have a problem with approximations, but the answer as stated needs to be fixed for correctness. – Jason S Jan 30 '12 at 18:34
@JasonS, another good point. I've made an edit. I'm not sure the double negative helps, even if it makes it more technically correct. – Mark Ransom Jan 30 '12 at 18:41

score 1 · Answer 3 · edited May 23 '17 at 12:00

Is there a simple rule ?

Empirical: convert and convert back, and if you get the same number, then it's exactly representable.

Theoretical: IEEE-754 uses a sign bit, M mantissa bits, and E exponent bits (M = 52, E = 11 for 64-bit double, M = 23, E = 8 for 32-bit float), represented as (+/- 1) * (1 + (m/2^M)) * (2^(e-(2^(E-1)-1))), for m = unsigned M-bit mantissa, e = unsigned E-bit exponent. If your number can be represented this way, then it's exactly representable. (There are also subnormal numbers for even smaller exponents that fit into the floating-point number bitspace)

The English translation of the above, is that if your number X can be written as either 0, or (+/- 1) * a nonnegative odd integer m' times a power of two 2^(e') then X might be representable; to be sure you have to check whether m' and e' fit in their respective bit spaces:

m' = 2k+1, k < 2^M

e' = an exponent range that I'm not sufficiently confident in my algebra to make sure I'm right. Anywhere within +/- 900 for double-precision numbers is fine, but if the exponent exceeds 900 in magnitude then you need to be careful, and really should look more closely at the way bits are actually represented.

Do you mean M = 23, E = 8 for 32-bit values? – R. Martinho Fernandes Jan 30 '12 at 18:25 — R. Martinho Fernandes, Jan 30 '12 at 18:25

decimal number and exact representation

3 Answers3

Linked