How to define and use numbers smaller than 2e-308

Question

The smallest positive normalized double is 2.22507e-308. Is there any way I can use smaller numbers? I found a library called GMP, but I have no idea how to use it: the documentation is not clear at all, and I'm not sure if it works on Windows.

I don't expect full instructions, but maybe at least some advice.

A 'double' is a type that the CPU handles natively: its floating-point (as opposed to integer) registers normally work with 8 bytes at once. Can you give a reason why you need "smaller numbers"? — Ripi2, Dec 05 '16 at 16:00
I need smaller numbers to calculate exact gamma function values for high-frequency complex arguments. — user1956641, Dec 05 '16 at 16:03
***I'm not sure if it works on windows.*** It does. ***GMP's main target platforms are Unix-type systems, such as GNU/Linux, Solaris, HP-UX, Mac OS X/Darwin, BSD, AIX, etc. It also is known to work on Windows in both 32-bit and 64-bit mode.*** — drescherjm, Dec 05 '16 at 16:08
http://stackoverflow.com/questions/1017058/building-gmp-library-with-visual-studio — drescherjm, Dec 05 '16 at 16:11

score 1 · Answer 1 · answered Dec 05 '16 at 16:02

If you need really high precision, give GMP a chance. I am sure it works on Windows too.

If you just need more precision than double, try long double. It may or may not give you more, depending on your compiler and target platform.

In my case it does give more (gcc 6, x86_64 linux):

Test program:

#include <iostream>
#include <limits>

int main() {
    std::cout << "float:"
        << " bytes=" << sizeof(float)
        << " min=" << std::numeric_limits<float>::min()
        << std::endl;

    std::cout << "double:"
        << " bytes=" << sizeof(double)
        << " min=" << std::numeric_limits<double>::min()
        << std::endl;

    std::cout << "long double:"
        << " bytes=" << sizeof(long double)
        << " min=" << std::numeric_limits<long double>::min()
        << std::endl;
}

Output:

float: bytes=4 min=1.17549e-38
double: bytes=8 min=2.22507e-308
long double: bytes=16 min=3.3621e-4932
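
A side note on the limits printed above: std::numeric_limits<T>::min() is the smallest positive normalized value, not the smallest positive value the type can represent. On IEEE 754 platforms, double also has subnormal values reaching down to about 4.94e-324, at the cost of reduced precision. Here is a minimal sketch of how one might check this, assuming subnormals are not flushed to zero (for example, no -ffast-math). The helper names are made up for illustration.

```cpp
#include <limits>

// Smallest positive *normalized* double (about 2.22507e-308).
double smallest_normal() {
    return std::numeric_limits<double>::min();
}

// Smallest positive subnormal double
// (about 4.94e-324 on IEEE 754 platforms).
double smallest_subnormal() {
    return std::numeric_limits<double>::denorm_min();
}

// Returns true if halving the smallest normalized double still gives
// a nonzero result, i.e. values below min() are representable.
// Assumption: the platform does not flush subnormals to zero.
bool survives_below_min() {
    volatile double x = smallest_normal();  // volatile: keep the division at run time
    x = x / 2;
    return x > 0.0;
}
```

This only buys about 16 extra decimal orders of magnitude, and precision degrades as values get smaller, so it is no substitute for long double or a multiprecision library if you need much more range.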

score 1 · Answer 2 · answered Dec 05 '16 at 16:04

If your compiler/architecture allows it, you could use something like long double, which on some platforms compiles to an 80-bit float (though I think it is aligned to 128 bits, so there's a bit of wasted space) and has more range and precision than a typical double. Not all compilers do that, though: on many compilers, long double is equivalent to double, at 64 bits.
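
Since the result depends on the compiler and target, it may help to check at build or run time what long double actually is. A rough sketch, using only std::numeric_limits; the function names and the returned labels are hypothetical, and the mapping from significand width to format is an assumption that holds for the common cases only:

```cpp
#include <limits>
#include <string>

// Guess the long double format from its significand width in bits.
std::string describe_long_double() {
    const int digits = std::numeric_limits<long double>::digits;
    if (digits == std::numeric_limits<double>::digits) {
        return "same as double";        // typical for MSVC
    }
    if (digits == 64) {
        return "80-bit x87 extended";   // typical for gcc on x86/x86_64
    }
    if (digits == 113) {
        return "IEEE 754 binary128";    // some other targets
    }
    return "other";                     // anything else, e.g. double-double
}

// True if long double can represent smaller normalized values than double.
bool long_double_has_more_range() {
    return std::numeric_limits<long double>::min_exponent
         < std::numeric_limits<double>::min_exponent;
}
```

If long_double_has_more_range() returns false on your toolchain, long double will not help with values below 2.22507e-308, and a multiprecision library is the way to go.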

"gmp" is one library you could use for extended precision floats. I generally recommend boost.multiprecision, which includes gmp, though personally, I'd use cpp_bin_float or cpp_dec_float for my multiprecision needs (the former is IEEE756 compliant, the latter isn't)

As for how to use them: I haven't used gmp, so I can't comment on its syntax, but cpp_bin_float is pretty easy to use:

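#include <boost/multiprecision/cpp_bin_float.hpp>
#include <iostream>

int main() {
    typedef boost::multiprecision::cpp_bin_float_quad quad;
    quad a = 34;
    quad b = 17.95467;
    b += a;
    // Square b ten times; the result is far beyond the range of double.
    for (int i = 0; i < 10; i++) {
        b *= b;
    }
    std::cout << "This might be rather big: " << b << std::endl;
}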

score 1 · Answer 3 · answered Dec 05 '16 at 16:04

If you switch your compiler to gcc or Intel, long double will be supported with higher precision (80-bit). For the default Visual Studio compiler, I have no advice for you.
