Why does the decimal 5.875 end up converting to 101.111 in floating point mathematics?

Question

Okay, so while studying systems programming, I have stumbled upon floating-point numbers. I'm kind of excited to master these, because it will be INSTRUMENTAL in learning other languages faster. The unfortunate part is: I am stuck, particularly on converting the decimal 5.875 to float.

The intuitive method I have already used is dividing every digit by 2 (depending on where I am). So: I divide 5 by 2 and get remainder 1. I divide 8 by 2 and get remainder 0. I divide 7 by 2 and get remainder 1. Finally, I divide 5 by 2 and get remainder 1 for... wait for it...

1.011

So, I checked the online converter and the answer is actually 101.111. I'm not sure why that is, so I went searching Google for the right math, and there are at least six different mathematical interpretations, all tackling different decimal representations.

Clearly, my math is wrong: how do I properly convert decimals to binary and, by extension, to floating point? I already know how a binary representation translates to a floating-point representation, but I'm a little stuck on how a decimal representation with a fractional part translates to a binary representation.

In short: 5 is encoded as 101 (4 + 0 + 1). .875 is 7/8, and is encoded as 7 * 0.001b, so as 0.111b (111b = 7, 0.001b = 1/8, b means binary). See also an answer of mine: https://stackoverflow.com/questions/6910115/how-to-represent-float-number-in-memory-in-c/6911412#6911412. That uses 5.2 to demonstrate how it works. — Rudy Velthuis, Oct 06 '18 at 21:33

score 3 · Accepted Answer · answered Oct 06 '18 at 13:56

101.111 as a binary numeral represents 1•2² + 0•2¹ + 1•2⁰ + 1•2⁻¹ + 1•2⁻² + 1•2⁻³ = 4 + 0 + 1 + ½ + ¼ + ⅛ = 5 + .5 + .25 + .125 = 5.875.
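To check that expansion mechanically, here is a minimal Python sketch that sums each bit times its power of two. The function name `binary_numeral_value` is made up for illustration, and `Fraction` is used only so the arithmetic stays exact.

```python
from fractions import Fraction

def binary_numeral_value(numeral: str) -> Fraction:
    """Evaluate a binary numeral such as '101.111' as a sum of bit * 2**position."""
    integer_part, _, fraction_part = numeral.partition(".")
    total = Fraction(0)
    # Integer bits carry weights 2**0, 2**1, ... counted from the right.
    for position, bit in enumerate(reversed(integer_part)):
        total += int(bit) * Fraction(2) ** position
    # Fraction bits carry weights 2**-1, 2**-2, ... counted from the left.
    for position, bit in enumerate(fraction_part, start=1):
        total += int(bit) * Fraction(1, 2 ** position)
    return total

print(float(binary_numeral_value("101.111")))  # 5.875
print(float(binary_numeral_value("1.011")))    # 1.375
```

The second call shows that the 1.011 from the question is a different number (1.375), not 5.875.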

There is no floating-point here.

The process for converting a decimal numeral to binary is not to examine individual digits to see whether they are even or odd. Generally, one converts the integer portion (5) and the fraction portion (.875) separately. An algorithm for the integer portion is:

The lowest bit is 0 or 1 according to whether the number is even or odd.
Subtract the bit from the number, divide it by two, and repeat the above to find the second lowest bit, then the third, and so on.
Stop when the number is zero.

For example, with your number 5:

5 is odd, so the lowest bit is 1.
(5−1)/2 = 2, which is even, so the second lowest bit is 0.
(2−0)/2 = 1, which is odd, so the third lowest bit is 1.
(1−1)/2 = 0, so we stop. The result is 101.
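The integer steps above can be sketched in Python roughly as follows. The name `integer_to_binary` is hypothetical, and the sketch assumes a non-negative integer input.

```python
def integer_to_binary(n: int) -> str:
    """Convert a non-negative integer to a binary numeral, lowest bit first."""
    if n == 0:
        return "0"
    bits = []
    while n != 0:
        bit = n % 2            # 0 if n is even, 1 if n is odd
        bits.append(str(bit))  # bits are found from lowest to highest
        n = (n - bit) // 2     # subtract the bit, then halve
    # The lowest bit was found first, so reverse to print highest bit first.
    return "".join(reversed(bits))

print(integer_to_binary(5))  # 101
```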

An algorithm for fractions is:

Multiply the number by two. The highest bit is 0 or 1 according to whether the product is less than 1 or at least 1 (it is always less than 2, because the number started below 1).
Subtract the bit and repeat the above to find the second highest bit, then the third, and so on.
Stop when the number is zero or you have as many bits as desired.

For example, with .875:

.875•2 = 1.75, which is 1 or greater, so the highest bit of the fraction is 1.
(1.75−1)•2 = 1.5, which is 1 or greater, so the next bit is 1.
(1.5−1)•2 = 1, which is 1 or greater, so the next bit is 1.
1−1 is 0, so we stop. The result is .111.
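A comparable sketch for the fraction steps, again in Python. The name `fraction_to_binary` and the `max_bits` cutoff are assumptions for illustration. The input is a `Fraction` so that the decimal value is held exactly; the cutoff matters because many decimal fractions (0.2, for instance) never reach zero.

```python
from fractions import Fraction

def fraction_to_binary(x: Fraction, max_bits: int = 32) -> str:
    """Convert a fraction in [0, 1) to a binary numeral, highest bit first."""
    bits = []
    while x != 0 and len(bits) < max_bits:
        x *= 2                    # shift the next bit in front of the point
        bit = 1 if x >= 1 else 0  # the product is always less than 2
        bits.append(str(bit))
        x -= bit                  # drop the bit and continue with the rest
    return "." + "".join(bits)

print(fraction_to_binary(Fraction("0.875")))           # .111
print(fraction_to_binary(Fraction(1, 5), max_bits=8))  # .00110011
```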

There are mathematical reasons these algorithms are correct, which could be elaborated if desired.

If you are looking at online converters, they may be showing how numbers are encoded in floating-point. Floating-point formats do not just convert a number to binary. They shift and manipulate the number in certain ways to encode it. Discussion of those encodings is separate from discussion of converting numbers to binary.
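As a rough illustration of that extra shifting, the sketch below assumes the converter targets the IEEE 754 single-precision (binary32) format, which the question does not actually specify. It uses Python's standard `struct` module to pull out the three stored fields; `binary32_fields` is a hypothetical helper name.

```python
import struct

def binary32_fields(value: float):
    """Split a number's IEEE 754 binary32 encoding into (sign, exponent, fraction)."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an integer.
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    sign = raw >> 31                # 1 bit
    exponent = (raw >> 23) & 0xFF   # 8 bits, stored with a bias of 127
    fraction = raw & 0x7FFFFF       # 23 bits after the implicit leading 1
    return sign, exponent, fraction

sign, exponent, fraction = binary32_fields(5.875)
print(sign, format(exponent, "08b"), format(fraction, "023b"))
# 0 10000001 01111000000000000000000
```

Under that assumption, 101.111 is stored as 1.01111 × 2²: the exponent field holds 2 + 127 = 129, and the fraction field holds the bits 01111 after the leading 1, padded with zeros.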

"Generally, one converts the integer portion (5) and the fraction portion (.875) separately." That is, IIRC, [what glibc does](https://www.exploringbinary.com/how-glibc-strtod-works/), but e.g. [David M. Gay's strtod() (dtoa.c)](https://www.exploringbinary.com/how-strtod-works-and-sometimes-doesnt/) doesn't, nor does [GCC](https://www.exploringbinary.com/how-gcc-converts-decimal-literals-to-floating-point/). — Rudy Velthuis, Oct 06 '18 at 21:20
FWIW, David M. Gay's strtod() is ported to other languages than C or C++ too, and is also used by e.g. Java, Python and PHP. — Rudy Velthuis, Oct 06 '18 at 21:28
