Division of two floats giving incorrect answer

Question

Attempting to divide two floats in C, using the code below:

#include <stdio.h>
#include <math.h>

int main(){
  float fpfd = 122.88e6;
  float flo = 10e10;
  float int_part, frac_part;

  int_part = (int)(flo/fpfd);
  frac_part = (flo/fpfd) - int_part;

  printf("\nInt_Part = %f\n", int_part);
  printf("Frac_Part = %f\n", frac_part);

  return(0);
}

To compile and run this code, I use the commands:

>> gcc test_prog.c -o test_prog -lm
>> ./test_prog

I then get this output:

Int_Part = 813.000000
Frac_Part = 0.802063

Now, this Frac_Part seems to be incorrect. I tried the same calculation first on a calculator and then in Wolfram Alpha, and both give me:

Frac_Part = 0.802083

Notice the number at the fifth decimal place is different.

This may seem insignificant to most, but for the calculations I am doing it is of paramount importance.

Can anyone explain to me why the C code is making this error?

That is not C, but a property of standard floating point types. What do you think can be represented in a fixed number of bits? Do some research on floating point (arithmetic). — too honest for this site, Aug 25 '15 at 13:18
Time for the good [What every programmer should know about floating point](http://floating-point-gui.de/). — RedX, Aug 25 '15 at 13:18
I don't think that is a good duplicate. This question is not about the general floating point precision problem. After all, using `double` over `float` still won't avoid the floating point precision problem, but it does give the expected result. — Yu Hao, Aug 25 '15 at 13:45

Chris Beck · Accepted Answer · 2015-08-25T15:04:15.257

When you get inadequate precision from floating point operations, the most natural first step is to use a floating point type of higher precision, e.g. use double instead of float. (As pointed out immediately in the other answers.)

Second, examine the different floating point operations and consider their precisions. The one that stands out to me as being a source of error is the method above of separating a float into integer part and fractional part, by simply casting to int and subtracting. This is not ideal, because, when you subtract the integer part from the original value, you are doing arithmetic where the three numbers involved (two inputs and result) have very different scales, and this will likely lead to precision loss.

I would suggest using the C <math.h> function modf instead to split a floating point number into its integer and fractional parts. http://www.techonthenet.com/c_language/standard_library_functions/math_h/modf.php

(In greater detail: When you do an operation like f - (int)f, the floating point addition procedure is going to see that two numbers of some given precision X are being added, and it's going to naturally assume that the result will also have precision X. Then it will perform the actual computation under that assumption, and finally reevaluate the precision of the result at the end. Because the initial prediction turned out not to be ideal, some low order bits are going to get lost.)

For a C post: rather than citing a C++ site and suggesting a function `modf()` from the C++ library `cmath`, recommending the C99 function `float modff(float value, float *iptr);` makes more sense — chux - Reinstate Monica, Aug 25 '15 at 14:57
that's a good point, edited the question. curiously this is what wikipedia references also: https://en.wikipedia.org/wiki/C_mathematical_functions — Chris Beck, Aug 25 '15 at 15:01
Your comment's ref is better, IMO. The standard, of course, is "best", yet it is not as easy to understand as derived sources such as the en.cppreference reference. — chux - Reinstate Monica, Aug 25 '15 at 15:07

score 3 · Answer 2 · answered Aug 25 '15 at 13:21

float is a single-precision floating point type; you should try double instead. The following code gives me the right result:

#include <stdio.h>
#include <math.h>

int main(){
  double fpfd = 122.88e6;
  double flo = 10e10;
  double int_part, frac_part;

  int_part = (int)(flo/fpfd);
  frac_part = (flo/fpfd) - int_part;

  printf("\nInt_Part = %f\n", int_part);
  printf("Frac_Part = %f\n", frac_part);

  return(0);
}

Why?

As I said, float is a single-precision floating point type and is smaller than double (on most architectures, sizeof(float) < sizeof(double)). By using double instead of float you get more bits to store both the mantissa and the exponent of the number (see Wikipedia).

Yu Hao · Answer 3 · score 2 · 2015-08-25T13:34:53.087

float has only 6~9 significant digits; it's not precise enough for most uses in practice. Changing all float variables to double (which provides 15~17 significant digits) gives the output:

Int_Part = 813.000000
Frac_Part = 0.802083

edited Aug 25 '15 at 13:34

answered Aug 25 '15 at 13:21


"`float` has only 7 significant digits, it's not precise enough for most uses in practice." I mean, that's not really true, right? It's more than adequate for OpenGL, which is surely a large fraction of the practical uses. It's more like, it's typically inadequate for numerical experiments / scientific data analysis. – Chris Beck Aug 25 '15 at 13:33
@ChrisBeck 7 is what `float` typically offers. I've edited to a more precise version. I still think the *not precise enough for most uses in practice* part is true, and arithmetic with `double` is usually faster than `float` on modern machines. – Yu Hao Aug 25 '15 at 13:41
Note: C guarantees `float` has _at least_ 6 decimal digits of precision. `double` has _at least_ 10 decimal digits of precision. – chux - Reinstate Monica Aug 25 '15 at 15:02
