1

I am trying to generate a random number between -10 and 10 with step 0.3 (though I want to have these be arbitrary values) and am having issues with double precision floating point accuracy. Float.h's DBL_DIG is meant to be the minimum accuracy at which no rounding error occurs [EDIT: This is false, see Eric Postpischil's comment for a true definition of DBL_DIG], yet when printing to this many digits, I still see rounding error.

#include <stdio.h>
#include <float.h>
#include <stdlib.h>

int main()
{
  for (;;)
  {
    printf("%.*g\n", DBL_DIG, -10 + (rand() % (unsigned long)(20 / 0.3)) * 0.3);
  }
}

When I run this, I get this output:

8.3
-7
1.7
-6.1
-3.1
1.1
-3.4
-8.2
-9.1
-9.7
-7.6
-7.9
1.4
-2.5
-1.3
-8.8
2.6
6.2
3.8
-3.4
9.5
-7.6
-1.9
-0.0999999999999996
-2.2
5
3.2
2.9
-2.5
2.9
9.5
-4.6
6.2
0.799999999999999
-1.3
-7.3
-7.9

Of course, a simple solution would be to just #define DBL_DIG 14 but I feel that is wasting accuracy. Why is this happening and how do I prevent this happening? This is not a duplicate of Is floating point math broken? since I am asking about DBL_DIG, and how to find the minimum accuracy at which no error occurs.

E_net4
  • 27,810
  • 13
  • 101
  • 139
  • 1
    This statement is false: “ Float.h's `DBL_DIG` is meant to be the minimum accuracy at which no rounding error occurs…” – Eric Postpischil Jan 16 '21 at 11:15
  • 1
    @churill: That is not a correct answer to this question. Please do not promiscuously mark floating-point questions as duplicates of that one. It interferes with teaching people about specific issues of floating-point arithmetic. – Eric Postpischil Jan 16 '21 at 11:17
  • If ```DBL_DIG``` is not the minimum accuracy for double-precision operations, then what is? – StavromulaBeta Jan 16 '21 at 11:19
  • 2
    The definition of `DBL_DIG` is the maximum number of decimal digits for which it is guaranteed that converting a number with that many significant decimal digits to `double` and back to a decimal numeral with that many digits produces the original number. A consequence is that for more digits, the round-trip conversion may change the number. This guarantee holds only for two conversions making a “round trip.” It is not a guarantee that when you do other arithmetic operations, you will get the same result as you would with decimal arithmetic. Your program has several other operations. – Eric Postpischil Jan 16 '21 at 11:27
  • That makes sense. How many digits should I print to in order to avoid rounding error in most circumstances? (not just in my example above) – StavromulaBeta Jan 16 '21 at 11:30
  • 1
    @StavromulaBeta undecidable. For every operation the error accumulates. – Antti Haapala -- Слава Україні Jan 16 '21 at 11:46
  • 1
    For example if you add 0.0000001 in a loop million times to a much larger value, it is going to be different from adding 0.0000001 * 1000000 – Antti Haapala -- Слава Україні Jan 16 '21 at 11:46
  • That's a fair point, thank you. I will keep that in mind in future. – StavromulaBeta Jan 16 '21 at 11:48
  • 1
    Python 3 has a very nice floating point stringification algorithm that produces the **shortest** representation that produces the **exact same double representation** when converted to double. It is good for debugging these precision issues easily. For `-10 + (33 * 0.3)` Python 3 produces `-0.09999999999999964` – Antti Haapala -- Слава Україні Jan 16 '21 at 11:53
  • 1
    @AnttiHaapala: Is that a Python 3 specification or just part of common implementations? – Eric Postpischil Jan 16 '21 at 11:55
  • 1
    @EricPostpischil good question, I'll check. Just saying that CPython is a good tool for debugging floating point issues.+ – Antti Haapala -- Слава Україні Jan 16 '21 at 11:56
  • 1
    @StavromulaBeta there are even *several* double values *between* the value closest to `-0.1` *and* `-0.09999999999999964`... – Antti Haapala -- Слава Україні Jan 16 '21 at 11:57
  • 1
    @EricPostpischil indeed it is not documented, so unfortunately a CPython implementation detail. – Antti Haapala -- Слава Україні Jan 16 '21 at 12:03
  • That algorithm certainly looks very interesting. I will look into it. – StavromulaBeta Jan 16 '21 at 12:17
  • I think that I've found the algorithm here: https://github.com/python/cpython/blob/3.8/Python/dtoa.c, but a possibly better one is here: https://github.com/ulfjack/ryu – StavromulaBeta Jan 16 '21 at 12:29
  • 1
    Note that these algorithms are the subject of academic papers and development over many years. It might be challenging to attempt to understand them by reading the code. While conversion between bases is easy using methods taught in elementary school using long division and such, this requires memory and is inefficient in processor time. Implementing an efficient algorithm requires fairly involved proofs about the maximum numbers of digits needed at various points and other work. So it is hard to understand all that went into code behind the scenes. – Eric Postpischil Jan 16 '21 at 12:35
  • @EricPostpischil I agree. It is clear that these algorithms took a lot of work, and I certainly can't understand how they do it. Ryu also looks incredibly efficient. – StavromulaBeta Jan 16 '21 at 12:39

2 Answers2

2

For the specific code in the question, we can avoid excess rounding errors by using integer values until the last moment:

printf("%.*g\n", DBL_DIG,
    (-100 + rand() % (unsigned long)(20 / 0.3) * 3.) / 10.);

This was obtained by multiplying each term in the original expression by 10 (−10 because −100 and .3 becomes 3) and then dividing the whole expression by 10. So all values we care about in the numerator1 are integers, which floating-point represents exactly (within range of its precision).

Since the integer values will be computed exactly, there will be just a single rounding error, in the final division by 10, and the result will be the double closest to the desired value.

How many digits should I print to in order to avoid rounding error in most circumstances? (not just in my example above)

Just using more digits is not a solution for general cases. One approach for avoiding error in most cases is to learn about floating-point formats and arithmetic in considerable detail and then write code thoughtfully and meticulously. This approach is generally good but not always successful as it is usually implemented by humans, who continue to make mistakes in spite of all efforts to the contrary.

Footnote

1 Considering (unsigned long)(20 / 0.3) is a longer discussion involving intent and generalization to other values and cases.

Eric Postpischil
  • 195,579
  • 13
  • 168
  • 312
  • This works great in this scenario, but I would like the program to work with arbitrary values for the bounds and step size. – StavromulaBeta Jan 16 '21 at 12:18
1

generate a random number between -10 and 10 with step 0.3
I would like the program to work with arbitrary values for the bounds and step size.
Why is this happening ....

The source of trouble is assuming that typcial real numbers (such as string "0.3") can encode exactly as a double.

A double can encode about 264 different values exactly. 0.3 is not one of them.

Instead the nearest double is used. The exact value and 2 nearest are listed below:

0.29999999999999993338661852249060757458209991455078125
0.299999999999999988897769753748434595763683319091796875  (best 0.3)
0.3000000000000000444089209850062616169452667236328125

So OP's code is attempting "-10 and 10 with step 0.2999..." and printing out "-0.0999999999999996" and "0.799999999999999" is more correct than "-0.1" and "0.8".


.... how do I prevent this happening?

  1. Print with a more limited precision.

    // reduce the _bit_ output precision by about the root of steps
    #define LOG10_2 0.30102999566398119521373889472449
    int digits_less = lround(sqrt(20 / 0.3) * LOG10_2);
    for (int i = 0; i < 100; i++) {
      printf("%.*e\n", DBL_DIG - digits_less,
          -10 + (rand() % (unsigned long) (20 / 0.3)) * 0.3);
    }
    
    9.5000000000000e+00
    -3.7000000000000e+00
    8.6000000000000e+00
    5.9000000000000e+00
    ...
    -1.0000000000000e-01
    8.0000000000000e-01
    

OP's code really is not doings "steps" as that hints toward a loop with a step of 0.3. The above digits_less is based on repetitive "steps", otherwise OP's equation warrants about 1 decimal digit reduction. The best reduction in precisions depends on estimating the potential cumulative error of all calculations from "0.3" conversion --> double 0.3 (1/2 bit), division (1/2 bit), multiplication (1/2 bit) and addition (more complicated bit).

  1. Wait for the next version of C which may support decimal floating point.
chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • This is a great answer. The explanation of cumulative rounding error is really helpful, and I will definitely remember it when writing code in future. – StavromulaBeta Jan 16 '21 at 21:54