-1

I've read several questions that explain how this works, but not how to get around it. My program scans 10 floats using scanf and prints the sum, min, max, and product of all the floats.

For example, given the input:

1.45 -2e2 -2e-2 14 -10.0 0.01 -0.02 20 -3e1 +4e+0

The program should produce:

Sum is -200.58000
Min is -200.00000
Max is 20.00000
Product is -389.76000

but instead it produces:

Sum is -200.58000
Min is -200.00000
Max is 20.00000
Product is -389.75999

I understand why this happens, but can anyone tell me how to get around this?

Here's my code:

#include <stdio.h>

int main() {
        double min, max, sum, product;

        printf("Sum is %.5f \n", sum);
        printf("Min is %.5f \n", min);
        printf("Max is %.5f \n", max);
        printf("Product is %.5f \n", product);
}
Jack A
  • 2
    You can use `double` instead of `float`, or you can show fewer digits. – mch Sep 05 '19 at 15:38
  • [Kahan summation](https://en.wikipedia.org/wiki/Kahan_summation_algorithm) helps for addition. – Shawn Sep 05 '19 at 15:38
  • If there's no obvious reason why you need to use float - and from what you describe there's none - then use `int32_t` instead. Note that "I need to print a decimal comma" is _not_ a valid reason to use floating point. – Lundin Sep 05 '19 at 15:41
  • 1
    I am unable to reproduce your result. Using `float` outputs -389.75998, and using `double` outputs -389.76000. So how did you get that result (see [mcve])? And what compiler are you using? – user3386109 Sep 05 '19 at 15:45
  • 5
    See [Is floating point math broken?](https://stackoverflow.com/questions/588004/is-floating-point-math-broken) – Jonathan Leffler Sep 05 '19 at 16:01
  • Possible duplicate of [Is floating point math broken?](https://stackoverflow.com/questions/588004/is-floating-point-math-broken) – giusti Sep 05 '19 at 16:15
  • @user3386109 The sum/min/max/product values are all doubles in my program. All of the scanned numbers are floats but they are put into an array of doubles. I looped through all of the scanned numbers, printing them out on each loop and they were all fine. Whenever they were multiplied or summed was when these small numbers came about. I am using gcc. – Jack A Sep 05 '19 at 16:33
  • @JackA Don't describe the code, post the code. – user3386109 Sep 05 '19 at 16:37
  • @user3386109 ok i updated the main post with my code. thanks for your replies! – Jack A Sep 05 '19 at 16:42
  • Change `float scan1, scan2, ` to `double scan1, scan2, ` and in the `scanf` change `%f` to `%lf`. – user3386109 Sep 05 '19 at 16:49
  • @user3386109 thanks man that worked, i guess i need to read over the docs for scanf again – Jack A Sep 05 '19 at 16:52
  • 2
    @JackA You're welcome. Keep in mind that even double precision math is not exact. However, a `double` has about 16 digits of resolution, whereas a `float` has only about 7. So avoiding `float` variables is a good first step. – user3386109 Sep 05 '19 at 17:01

3 Answers

1

The question can't be answered without clarifying what "get around this" means. If you just want the example case to round the way you've shown as expected, though, use double instead of float. (In general, don't use float at all except for bulk storage of samples where total size will become a limiting factor; it's a pathological type.)

R.. GitHub STOP HELPING ICE
1

If there are no constraints on the magnitude of the floats, and there are 10 of them to be considered, then this is a difficult problem to solve in full generality.

It would be tempting to provide you with a specific solution for your case, especially given that both the product and the sum are within the order of magnitude of the set of inputs, but I imagine that's coincidental. If it's not, then reading the data as double, doing the mathematics in double precision, and converting to float at the end will almost certainly work.

Given that the bottleneck will almost certainly be in the input / output rather than the numerics, if you need an exact answer, and want an easy life achieving that, then use a decimal type for your numbers. There are good arbitrary precision decimal libraries out there for C.

Bathsheba
1

The reason you've found so many explanations of this problem, but not so many solutions to how to "fix it", is that under one set of assumptions, it's not broken, so there's no way to fix it. Binary floating-point arithmetic is not exact and is not intended to be, and more importantly cannot exactly represent decimal fractions and is not intended to. There is no such binary floating-point number as 389.76; the closest you can get is something like 389.75999, so a program that prints 389.75999 is not wrong.

Now, I realize that's not the answer you want. In decimal floating-point, and in pure mathematics, the number 389.76 does exist, and that's the result you want. So if binary computer floating-point arithmetic can't give you this result, what can you do? There are several answers:

  1. Round off while (or before) printing. (The problem then becomes, how many places to round to?)
  2. Use double instead of float. (In the end, double has all the same problems -- it's still binary floating-point -- but its increased precision often makes the difference between not-working and working.)
  3. Roll your own fixed point. Do all your arithmetic using integers, but with the convention that they're multiplied by 100, or 1000. (If you're working with cash, this is stated as "use cents, not dollars.")
  4. Use a decimal arithmetic library. (I've never used such a thing, but I think there's one here.)
  5. Use a library that can do some level of symbolic manipulation to implement "pure mathematics": that can represent not only decimal numbers like 389.76, but also rational or irrational numbers like 1/3 or sqrt(2), all as "themselves", with perfectly mathematically accurate results.
Steve Summit