
I am working on a calculation that converts a double to binary. A strange problem occurs during the process and eventually produces a wrong result, so I printed the fractional part at each step to see where it goes wrong.

The code that handles the fractional part looks like this:

    while(float_part != (int)(float_part)){
        float_part -= (int)(float_part); //keep only the fractional part
        float_part *= 2; //float_part is a double
        res = res + to_string(((int)(float_part))); //append the next bit to "res", which is a string
        cout << float_part << "+" << length << "\n"; //debug output to figure out why
        length--;  //length is initialized to 32
        if(length <= 0){
            return "ERROR"; //if too long
        }
    }

Then I input "28187281.525" (only the .525 matters for the code above) and got this strange output:

    1.05+32
    0.1+31
    0.2+30
    0.4+29
    0.8+28
    1.6+27
    1.2+26
    0.4+25
    0.799999+24
    1.6+23
    1.2+22
    0.399994+21
    0.799988+20
    1.59998+19
    1.19995+18
    0.399902+17
    0.799805+16
    1.59961+15
    1.19922+14
    0.398438+13
    0.796875+12
    1.59375+11
    1.1875+10
    0.375+9
    0.75+8
    1.5+7
    1+6
    1101011100001101010010001.100001100110011001100110011

At first the values look right, but eventually they drift and the result becomes wrong!

And why does 0.4*2 become 0.799999...?

Does anyone know the reason? Thanks in advance!

Meena
彭浩翔

2 Answers


Floating-point values have limited precision. Any operation you perform on them can introduce small errors, and the more operations you perform, the more the error can grow. In your case, you should split your floating-point variable into its integer components (sign, mantissa and exponent) and perform any operations on those integers. Floating-point values are normally stored in IEEE 754 format:

https://en.wikipedia.org/wiki/Floating_point#IEEE_754:_floating_point_in_modern_computers

G. Sliepen
  • So basically, converting the fractional part to an int or long is the only safe way? Are there any other tricks? – 彭浩翔 Oct 18 '16 at 10:43
  • The trick is to copy the floating point value into a suitably sized `int` (you can use a `union` for this or straight `memcpy()`), and then use bitwise operations and shifts to get the sign bit, exponent and mantissa. Once you have done this, things should be very easy, because basically you are just printing the mantissa in binary, and the only problem left is to put the decimal point in the right place. – G. Sliepen Oct 18 '16 at 11:22
  • Thanks, I followed this approach and it seems to work! – 彭浩翔 Oct 18 '16 at 12:35
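
The approach described in the comments above can be sketched roughly as follows. This is a minimal illustration, not the answerer's actual code: `DoubleFields`, `decompose`, and `to_binary` are hypothetical names, and the sketch assumes a 64-bit IEEE 754 `double` that is finite, normal, and has a magnitude in [1, 2^53).

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper type: the three fields of an IEEE 754 binary64 value.
struct DoubleFields {
    bool sign;               // true if the value is negative
    int exponent;            // unbiased exponent
    std::uint64_t mantissa;  // 53-bit significand, implicit leading 1 included
};

// Copy the raw bits into an integer and pull the fields apart with shifts and masks.
// Assumption: double is a 64-bit IEEE 754 type and the value is normal.
DoubleFields decompose(double value) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "assumes a 64-bit double");
    std::uint64_t bits = 0;
    std::memcpy(&bits, &value, sizeof bits);

    DoubleFields f;
    f.sign = (bits >> 63) != 0;
    f.exponent = static_cast<int>((bits >> 52) & 0x7FF) - 1023;
    f.mantissa = (bits & ((std::uint64_t{1} << 52) - 1)) | (std::uint64_t{1} << 52);
    return f;
}

// Print the significand in binary and place the binary point according to the exponent.
// Assumption: 0 <= exponent <= 52, i.e. 1 <= |value| < 2^53.
std::string to_binary(double value) {
    const DoubleFields f = decompose(value);
    assert(f.exponent >= 0 && f.exponent <= 52);

    const std::string digits = std::bitset<53>(f.mantissa).to_string();
    const std::size_t int_digits = static_cast<std::size_t>(f.exponent) + 1;

    std::string result = digits.substr(0, int_digits) + "." + digits.substr(int_digits);
    while (result.back() == '0') result.pop_back();  // drop trailing zeros of the fraction
    if (result.back() == '.') result.pop_back();     // no fractional bits left
    return (f.sign ? "-" : "") + result;
}
```

Under these assumptions, `to_binary(28187281.525)` should produce the same digits as the last line of the question's output, using only integer operations on the stored bits.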

This is the nature of finite precision arithmetic when you manipulate values that can't be represented exactly.

0.4*2 becomes 0.799999 for the same reason 1/3 times 3 becomes 0.999999: the best you can do with six decimal digits is represent 1/3 as 0.333333, and if you multiply that by 3, you get 0.999999. You would need an infinite number of digits to get the exact answer.
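
To see the stored value rather than a rounded one, it can help to print more digits. The sketch below assumes IEEE 754 doubles; `format_digits` and `fractional_part` are hypothetical helper names. By default a stream prints 6 significant digits, which is likely why the early lines of the question's output look exact.

```cpp
#include <cmath>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: format a double with the given number of significant digits.
std::string format_digits(double value, int digits) {
    std::ostringstream out;
    out << std::setprecision(digits) << value;
    return out.str();
}

// Hypothetical helper: the part of the value after the decimal point.
double fractional_part(double value) {
    return value - std::trunc(value);
}
```

For example, `format_digits(0.4, 6)` gives `0.4`, while `format_digits(0.4, 17)` gives `0.40000000000000002`. Likewise, `fractional_part(28187281.525)` prints as `0.525` with 6 digits but compares unequal to `0.525`, because most of the double's bits are taken up by the integer part.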

David Schwartz
  • Hey, thanks man. So practically, how should I do calculations on a float or double? Convert the fractional part to an int or long? It seems so troublesome... – 彭浩翔 Oct 18 '16 at 10:45
  • It really depends on exactly what you're trying to do. But perhaps you're using the wrong tool for the job and should be using something else. For fractions, maybe ratios of integers. For floating point, maybe higher precision, maybe rounding, maybe floating decimal point. It depends on exactly what your requirements are. But you can't skip the step of identifying your requirements and choosing tools that can meet them. – David Schwartz Oct 18 '16 at 10:47
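
As one illustration of the "ratios of integers" idea, the fraction can be kept as a numerator over a power-of-ten denominator so that doubling never rounds. This is a minimal sketch, assuming the fractional digits are available as a string of at most 18 decimal digits; `fraction_to_binary` is a hypothetical helper, not something from the question.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: binary expansion of the decimal fraction 0.<decimal_digits>,
// computed exactly with integers. Produces at most max_bits bits.
// Assumption: decimal_digits holds only '0'..'9' and at most 18 characters,
// so that the doubled numerator still fits in 64 bits.
std::string fraction_to_binary(const std::string& decimal_digits, int max_bits) {
    std::uint64_t numerator = 0;
    std::uint64_t denominator = 1;
    for (char c : decimal_digits) {
        numerator = numerator * 10 + static_cast<std::uint64_t>(c - '0');
        denominator *= 10;
    }

    std::string bits;
    while (numerator != 0 && static_cast<int>(bits.size()) < max_bits) {
        numerator *= 2;                  // exact: no rounding with integers
        if (numerator >= denominator) {  // the doubled fraction reached 1
            bits += '1';
            numerator -= denominator;
        } else {
            bits += '0';
        }
    }
    return bits;
}
```

For instance, `fraction_to_binary("525", 12)` yields `100001100110`, and the repeating `1100` pattern continues for as many bits as requested instead of drifting.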