0

Sorry for my English. Can you tell me the smallest positive number of type double, below which the computer treats the number as equal to zero?

Paul Floyd
NDGO

4 Answers

4

Actual zero is zero, but a result can become zero in different ways. A double has a value range of roughly ±10^±308. A positive result that is no more than half of the smallest representable positive number is rounded to zero under the default round-to-nearest mode. Using #include <limits>, you can get std::numeric_limits<double>::denorm_min(), which is the smallest positive value that can be represented in a double.
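A minimal sketch to check this on your own machine. The helper name `halving_gives_zero` is made up for illustration, and the code assumes IEEE 754 doubles with subnormal support and the default round-to-nearest mode (flush-to-zero settings would change the result):

```cpp
#include <cassert>
#include <limits>

// Assumption: double follows IEEE 754 (true on most current platforms).
static_assert(std::numeric_limits<double>::is_iec559, "IEEE 754 double expected");

// Hypothetical helper: true if x is positive but x / 2 underflows to zero.
bool halving_gives_zero(double x) {
    assert(x > 0.0);
    volatile double half = x;  // volatile keeps the division at run time
    half = half / 2.0;
    return half == 0.0;
}
```

Called with `std::numeric_limits<double>::denorm_min()` this should return true, while with `std::numeric_limits<double>::min()` it should return false, because half of the smallest normalized value is still representable as a denormal.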

But you can get "the effect of zero" in other ways. Say you have a fairly large number, 10 million, and you add a very small number, say 1e-10 (read "add" as "add or subtract" in the rest of this paragraph). The addition will have no effect, because the small number falls below the value bits of the mantissa of the floating point number (53 bits in the case of double), so the effect is the same as adding zero. In other words, even if you have a number that is not zero, adding it to another number is not always going to change the other number.
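Whether a small addend is lost depends on the gap between adjacent doubles near the larger number. A rough sketch to measure that; `is_absorbed` and `spacing_above` are hypothetical helper names, and IEEE 754 doubles with round-to-nearest are assumed:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: true if adding `small` to `big` leaves `big` unchanged.
bool is_absorbed(double big, double small) {
    volatile double sum = big + small;  // volatile forces rounding to double
    return sum == big;
}

// Gap between x and the next larger double (one unit in the last place).
double spacing_above(double x) {
    assert(std::isfinite(x));
    return std::nextafter(x, std::numeric_limits<double>::infinity()) - x;
}
```

Near 10 million the gap is 2^-29 (about 1.9e-9), so an addend below roughly half of that, such as 1e-10, is absorbed, while a larger one such as 1e-7 still changes the sum.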

See IEEE-754 on Wikipedia (other floating point formats do exist, but they are unusual).

Mats Petersson
  • More precisely `numeric_limits<double>::min()`. – TrueY May 18 '13 at 06:50
  • 1) `denorm_min` can be smaller than `min`, 2) rounding modes can cause arbitrarily small values to be rounded to denorm_min and not zero. – Marc Glisse May 18 '13 at 07:31
  • -1, until you change that `numeric_limits<double>::min()` to `numeric_limits<double>::denorm_min()`. `numeric_limits<double>::min()` is the smallest normalized number that can be represented as a double. Values smaller than that are possible. It's `numeric_limits<double>::denorm_min()` that is the smallest value that can be represented as a double. – David Hammen May 18 '13 at 12:12
2

You could try:

#include <limits>
std::numeric_limits<double>::denorm_min();

Doc for denormal (aka subnormal) numbers (here).

If this number is divided by, e.g., 2, the result is 0 (in the default rounding mode).

To check these values on a specific platform, the following code can be used:

#include <cstring>
#include <iostream>
#include <limits>

typedef double real;  // change to float or long double to inspect those types

// Print the value, then its bytes in hex from the highest address down
// (most significant byte first on a little-endian machine).
void pr(real d, const char *txt = 0) {
    unsigned char c[sizeof(real)];
    std::memcpy(c, &d, sizeof(real));  // well-defined, unlike reading an inactive union member
    if (txt) std::cout << txt << ": ";
    std::cout << d << ":";
    for (int i = sizeof(real) - 1; i >= 0; --i)
        std::cout << std::hex << " " << (int)c[i];
    std::cout << std::endl;
}

int main() {
    // Halve the value repeatedly until it underflows to zero.
    real n = 1.0;
    for (; n > 0.0; n /= 2.0)
        pr(n);
    pr(n, "zero");
    pr(std::numeric_limits<real>::min(), "min");
    pr(std::numeric_limits<real>::denorm_min(), "denorm_min");
}

Output on 32-bit Linux (Intel CPU) (doc about double format):

1: 3f f0 0 0 0 0 0 0
0.5: 3f e0 0 0 0 0 0 0
0.25: 3f d0 0 0 0 0 0 0
0.125: 3f c0 0 0 0 0 0 0
0.0625: 3f b0 0 0 0 0 0 0
...
8.9003e-308: 0 30 0 0 0 0 0 0
4.45015e-308: 0 20 0 0 0 0 0 0
2.22507e-308: 0 10 0 0 0 0 0 0
1.11254e-308: 0 8 0 0 0 0 0 0
5.56268e-309: 0 4 0 0 0 0 0 0
...
7.90505e-323: 0 0 0 0 0 0 0 10
3.95253e-323: 0 0 0 0 0 0 0 8
1.97626e-323: 0 0 0 0 0 0 0 4
9.88131e-324: 0 0 0 0 0 0 0 2
4.94066e-324: 0 0 0 0 0 0 0 1
zero: 0: 0 0 0 0 0 0 0 0
min: 2.22507e-308: 0 10 0 0 0 0 0 0
denorm_min: 4.94066e-324: 0 0 0 0 0 0 0 1

If real is defined as long double the output is:

1: 0 0 3f ff 80 0 0 0 0 0 0 0
0.5: 0 0 3f fe 80 0 0 0 0 0 0 0
0.25: 0 0 3f fd 80 0 0 0 0 0 0 0
0.125: 0 0 3f fc 80 0 0 0 0 0 0 0
0.0625: 0 0 3f fb 80 0 0 0 0 0 0 0
...
5.83232e-4950: 0 0 0 0 0 0 0 0 0 0 0 10
2.91616e-4950: 0 0 0 0 0 0 0 0 0 0 0 8
1.45808e-4950: 0 0 0 0 0 0 0 0 0 0 0 4
7.2904e-4951: 0 0 0 0 0 0 0 0 0 0 0 2
3.6452e-4951: 0 0 0 0 0 0 0 0 0 0 0 1
zero: 0: 0 0 0 0 0 0 0 0 0 0 0 0
min: 3.3621e-4932: 0 0 0 1 80 0 0 0 0 0 0 0
denorm_min: 3.6452e-4951: 0 0 0 0 0 0 0 0 0 0 0 1

Or for float:

1: 3f 80 0 0
0.5: 3f 0 0 0
0.25: 3e 80 0 0
0.125: 3e 0 0 0
0.0625: 3d 80 0 0
...
2.24208e-44: 0 0 0 10
1.12104e-44: 0 0 0 8
5.60519e-45: 0 0 0 4
2.8026e-45: 0 0 0 2
1.4013e-45: 0 0 0 1
zero: 0: 0 0 0 0
min: 1.17549e-38: 0 80 0 0
denorm_min: 1.4013e-45: 0 0 0 1
TrueY
  • But this doesn't answer the question. – juanchopanza May 18 '13 at 07:09
  • @juanchopanza: a double value less than denorm_min is considered to be zero. – TrueY May 18 '13 at 07:39
  • @TrueY and how do you get that number? – juanchopanza May 18 '13 at 07:41
  • @juanchopanza: Which number? The smallest positive double can be obtained using `double d = std::numeric_limits<double>::denorm_min();`. I added an example to my answer. If you divide this number by e.g. 2, you get zero. – TrueY May 18 '13 at 11:51
  • @NDGO: see the extended answer. – TrueY May 18 '13 at 12:13
  • Aside: If you use C I/O you can get a portable hexadecimal representation of a double precision number with the `"%a"` format. C++ does not offer that. – David Hammen May 18 '13 at 12:18
  • @DavidHammen: You are right, but the tag is C++, so I would like to use the C++ style. Otherwise I could use `float.h` and `DBL_MIN`. Anyway `DBL_MIN` seems to be equal to `std::numeric_limits<double>::min()`, not `denorm_min()`. – TrueY May 18 '13 at 12:29
  • I was writing about how you generated that hex output, not the values such as `DBL_MIN`. For example, `printf ("%a", double_val)`. As far as I can see, there isn't any C++ flag that lets you generate that `"%a"` format. `hex` applies to integers only, even in C++11. – David Hammen May 18 '13 at 12:47
  • @DavidHammen: Maybe I'm wrong, but in C++ I like to use the C++-ish solutions. I prefer to use `cout <<` instead of `printf` in C++ code. BTW I do not really like the `cout <<` syntax and I prefer `printf`. But this is a C++ example. :) – TrueY May 18 '13 at 14:01
  • C library functions are callable from C++. Arguably, the C I/O library *is* a part of C++, accessible via the C++ (not C) header `<cstdio>`. – David Hammen May 18 '13 at 15:14
  • @DavidHammen: You are absolutely right about that! As I said I just did not like to add C style code to a C++ example. Sorry for that! – TrueY May 18 '13 at 16:27
  • Try adding `#include <cfenv>` and then in main before the loop: `fesetround(FE_UPWARD);` (you may then want to compile with `-frounding-math` if you use gcc or `-O0` with llvm). Now enjoy the infinite loop ;-) – Marc Glisse May 20 '13 at 20:18
  • @MarcGlisse:Thx for the comment! Is it still an infinite loop if it is divided by for example 10? – TrueY May 21 '13 at 09:15
  • How about trying it? (yes, the theoretical result is between 0 and denorm_min, and if the rounding is towards +infinity, that means denorm_min) – Marc Glisse May 21 '13 at 09:34
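For the `"%a"` format discussed in the comments above, here is a small sketch that calls the C library from C++ through `<cstdio>`. The helper names `to_hex_float` and `from_hex_float` are made up for illustration, and the exact text produced (for example how subnormals are written) differs between C libraries. C++11 also has a `std::hexfloat` stream manipulator, though library support arrived later in some implementations.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: format a double in C99 hexadecimal notation ("%a").
std::string to_hex_float(double x) {
    char buf[64];
    const int n = std::snprintf(buf, sizeof buf, "%a", x);
    assert(n > 0 && n < static_cast<int>(sizeof buf));
    return std::string(buf, n);
}

// Hypothetical helper: parse the text back; strtod accepts hex floats since C++11.
double from_hex_float(const std::string &s) {
    return std::strtod(s.c_str(), nullptr);
}
```

Because the hexadecimal form is exact, converting a value to text and back should give the same double, which makes it handy for comparing values across platforms.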
0

In the IEEE 754 single-precision (32-bit) and double-precision (64-bit) formats:

The smallest positive normal value of double is 0x1.0p-1022 (about 2.2250738585072014E-308).

The smallest positive denormal value of double is 0x0.0000000000001P-1022 (about 4.9e-324).

The smallest positive normal value of float is 0x1.0p-126f (about 1.17549435E-38f).

The smallest positive denormal value of float is 0x0.000002P-126f (about 1.4e-45f).

Positive results smaller than the denormal values above may be rounded to 0, depending on the rounding mode, as Marc Glisse commented.
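A rough sketch of the rounding-mode effect. The helper `halve_with_rounding` is a made-up name, and the code assumes the platform honours `fesetround` for double division; some compilers need extra flags (for example `-frounding-math` with GCC) before this is reliable:

```cpp
#include <cassert>
#include <cfenv>
#include <limits>

// Assumption: double follows IEEE 754 with subnormal support.
static_assert(std::numeric_limits<double>::is_iec559, "IEEE 754 double expected");

// Hypothetical helper: compute x / 2 under the given rounding mode,
// then restore the mode that was active before the call.
double halve_with_rounding(double x, int mode) {
    const int saved = std::fegetround();
    const int rc = std::fesetround(mode);
    assert(rc == 0);        // 0 means the mode was accepted
    (void)rc;               // silence unused warnings when NDEBUG is defined
    volatile double v = x;  // volatile keeps the division at run time
    v = v / 2.0;
    std::fesetround(saved);
    return v;
}
```

With `FE_TONEAREST`, halving `denorm_min()` should give 0, while with `FE_UPWARD` the result should be rounded back up to `denorm_min()`, so a loop that keeps halving never reaches zero.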

johnchen902
  • These values you state are platform dependent. You'd be better off taking them from the [`<limits>`](http://www.cplusplus.com/reference/limits/numeric_limits/) library. – Adrian May 18 '13 at 06:42
  • Depending on the rounding mode, arbitrarily small numbers may be rounded to non-zero if the rounding direction is "up" or "away from zero". – Marc Glisse May 18 '13 at 06:49
-1

When you compare a double value that has been calculated, you should never check for exact equality. You should check whether it is within a range instead. Otherwise there is a strong possibility that a comparison you expect to be true is not.
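A minimal sketch of such a range check. The function name `nearly_equal` and the default tolerances are illustrative assumptions, not standard values; suitable tolerances depend on the calculation:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: true if a and b differ by at most rel_tol relative to
// the larger magnitude, or by at most abs_tol (useful when both are near zero).
bool nearly_equal(double a, double b,
                  double rel_tol = 1e-9, double abs_tol = 1e-12) {
    assert(rel_tol >= 0.0 && abs_tol >= 0.0);
    const double diff = std::fabs(a - b);
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= std::max(rel_tol * scale, abs_tol);
}
```

For example, `0.1 + 0.2 == 0.3` is false with IEEE 754 doubles, but `nearly_equal(0.1 + 0.2, 0.3)` is true.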

This is possibly a duplicate of this question.

Community
Adrian