1

In my program I use numpy to exponentiate numbers, then I use the sum function to add them up. I've noticed that summing those large numbers, with or without numpy, returns the largest term unchanged.

import numpy as np
exp_joint_probabilities = np.array([1.57171938e+81, 1.60451506e+56, 1.00000000e+00])  # must be an array: a plain list has no .sum()
exp_joint_probabilities.sum()
=> 1.571719381352921e+81

The same with just python:

(1.57171938e+81+1.60451506e+56+1.00000000e+00)==1.57171938e+81
=>True

Is this a problem with approximation? Should I use a larger datatype to represent the numbers? How can I get a more accurate result for this kind of calculation?

shepp

3 Answers

2

You could use the decimal module from the standard library:

from decimal import Decimal

a = Decimal(1.57171938e+81)
b = Decimal(1.60451506e+56)
d = a + b
print(d)
print(d > a and d > b)

Output:

1.571719379999999945626903708E+81
True

You could convert it back to a float afterwards, but this will cause the same problem as before.

f = float(d)
print(f)
print(f > a and f > b)

Output:

1.57171938e+81
False

Note that if you store Decimals in your numpy arrays, you will lose fast vectorized operations, as numpy does not recognize Decimal objects. Though it does work:

import numpy as np

a = np.array([1.57171938e+81, 1.60451506e+56, 1.00000000e+00])
d = np.vectorize(Decimal)(a)  # convert values to Decimal
print(d.sum())
print(d.sum() > d[0])

Output:

1.571719379999999945626903708E+81
True
  • 1
    You are still first converting the numbers to the 64bit float approximations and then those approximate values to Decimal. To directly convert the input to Decimal give them as strings, `a = Decimal("1.57171938e+81")` and `b = Decimal("1.60451506e+56")` then `a+b` gives the more correct result `Decimal('1.571719380000000000000000160E+81')` – Lutz Lehmann Jul 25 '17 at 08:11
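The point made in the comment can be sketched as follows. This is a minimal illustration, not part of the original answer: the third operand `c`, the name `total`, and the precision of 100 digits are assumptions chosen for the example. The default decimal context keeps 28 significant digits, so the exact three-term sum (82 digits) needs a larger precision.

```python
from decimal import Decimal, localcontext

# Build the operands from strings so no binary-float rounding happens first.
a = Decimal("1.57171938e+81")
b = Decimal("1.60451506e+56")
c = Decimal("1.00000000e+00")  # third term from the question (assumed here)

# The default context keeps 28 significant digits, which would round the sum.
# 100 digits is an arbitrary choice that comfortably holds the exact result.
with localcontext() as ctx:
    ctx.prec = 100
    total = a + b + c

print(total)
print(total > a)  # True: the small terms are no longer lost
```

Converting `total` back with `float(total)` would again round to the nearest 64-bit float, so the extra digits only survive as long as the value stays a `Decimal`.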
1

1.57171938e+81 is a number with 82 digits, of which you only enter the first 9. 1.60451506e+56 is a much, much smaller number, with only 57 digits.

What kind of answer are you expecting? The first utterly dwarfs the second. If you want something of a similar precision to your original numbers (and that's what you get using floats), then the answer is simply correct.
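To make "dwarfs" concrete, here is a small sketch that measures the spacing between adjacent 64-bit floats near the large value. It assumes Python 3.9 or later for `math.ulp`; the variable names are only for illustration.

```python
import math

a = 1.57171938e+81
b = 1.60451506e+56

# math.ulp(a) is the gap between a and the next larger representable float.
gap = math.ulp(a)
print(gap)

# An addend smaller than half that gap is rounded away entirely.
print(b < gap / 2)   # True
print(a + b == a)    # True
```

Any addend below half of `gap` leaves `a` unchanged, which is why the sum in the question comes back as the largest term.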

You could use ints:

>>> a = int(1.57171938e+81)
>>> b = int(1.60451506e+56)
>>> a
1571719379999999945626903548020224083024251666384876684446269499489505292916359168L
>>> b
160451506000000001855754747064077065047170486040598151168L
>>> a+b
1571719379999999945626903708471730083024253522139623748523334546659991333514510336L

But how useful that is is up to you.

RemcoGerlich
0

It does seem to be a problem with approximation:

>>> 1.57171938e+81 + 1.60451506e+65 > 1.57171938e+81
<<< True

>>> 1.57171938e+81 + 1.60451506e+64 > 1.57171938e+81
<<< False

You can get around this by converting to int:

>>> int(1.57171938e+81) + int(1.60451506e+64) > int(1.57171938e+81)
<<< True
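Converting to int only works for whole numbers. As a hedged alternative (not something the answer itself proposes), the standard library's `fractions.Fraction` converts each float exactly and sums without rounding; the names `values` and `exact_total` are made up for this sketch.

```python
from fractions import Fraction

values = [1.57171938e+81, 1.60451506e+56, 1.00000000e+00]

# Fraction(float) represents each 64-bit float exactly, so the sum is exact.
exact_total = sum(Fraction(v) for v in values)

print(exact_total > Fraction(values[0]))   # True: the small terms are kept
print(float(exact_total) == values[0])     # True: converting back to float loses them
```

As with int, the exact result only helps while it stays in the exact type; the price is much slower arithmetic than native floats.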
iCart