Is there any way to not lose the precision and still get the value?

Question

First off, I'm a total beginner at C, with prior programming experience in Java and Python. The goal of the program was to add two numbers. While playing with the code, I ran into a precision issue when I added two numbers, one of type float and the other of type double.

Code:

#include <stdio.h>

int main() {
    double b=20.12345678;
    float c=30.1234f;
    printf("The Sum of %.8f and %.4f is= %.8f\n", b, c, b+c);
    return 0;
}

Output:

The Sum of 20.12345678 and 30.1234 is= 50.24685651

However, the correct output should be: 50.24685678

float values are accurate up to 6 decimal places, and so is the output. I tried casting the values explicitly to double, but it's still of no use.

PS: When I change the variable's type from float to double, the output is precise; but is there any other way to add float and double values without changing their data types? Thank you.

Even with `double` you'll find imprecision. There are an infinite amount of numbers, yet your computer has finite resources. See [Is floating point math broken?](https://stackoverflow.com/questions/588004/is-floating-point-math-broken) — yano, Oct 19 '21 at 14:21
I think this can be done with shifting bits and storing precision in separate variable — user786, Oct 19 '21 at 14:42
There are two things you have to remember. Not only do floats (and doubles) have limited precision, but they use binary internally, *not* decimal. So when you say `float c=30.1234f`, you do *not* get 30.1234000000 in `c`, where it's obvious that it cuts off cleanly after 6 digits. No, what you get is a binary number that cuts off cleanly after 24 *bits*. In binary that number is `0b11110.0001111110010111001`, and in hex it's `0x1E.1F972`. If you convert it to decimal it's equivalent to 30.1233997344970703125, which explains why when you added it to `b` it could change the last 678 to 651. — Steve Summit, Oct 19 '21 at 16:33

dbush · Answer 1 · 2021-10-19T14:37:38.713

2

The value assigned to c can't be expressed exactly so it gets assigned the next closest value. You don't see that when printing to 4 decimal places but you do see it if you print 8:

printf("The Sum of %.8f and %.8f is= %.8f\n", b, c, b+c);

Output:

The Sum of 20.12345678 and 30.12339973 is= 50.24685651

So the constant 30.1234f is already imprecise enough for the calculation you're trying to do.

edited Oct 19 '21 at 14:37

answered Oct 19 '21 at 13:55

dbush

score 2 · Accepted Answer · answered Oct 19 '21 at 14:52

float only guarantees 6 decimal digits of precision, so any computation with a float (even if the other operands are double, even if you're storing the result to a double) will only be precise to 6 digits.

If you need greater precision, then limit yourself to double or long double. If you need more than 10 decimal digits of precision, then you'll need to use something other than the native floating point types and library functions. You'll either need to roll your own, or use an arbitrary precision math library like GNU MP.

John Bode

`float` “guarantees” six digits only in a very limited sense. This crude statement of it should not be taught, as it promotes misunderstanding. Notably, the conclusion derived from it where “so … will only be precise to 6 digits” is incorrect. – Eric Postpischil Oct 20 '21 at 12:05
