1

Consider the following NumPy vector of numbers:

import numpy as np
a = np.array([0.1, 0.2, 0.2, 0.2, 0.2, 0.1])

Mathematically, these numbers sum to 1. However, when I compute

b = np.sum(a)

I get

print(b)
0.9999999999999999

Could anyone explain why this happens and how to solve this approximation issue?

user1363251
  • It has nothing to do with approximations (well, not really). Python, C and many other languages use the IEEE 754 floating-point format, and because it is a limited-precision format, none of the numbers in your array can be represented exactly as a binary value. – Tony Suffolk 66 Jul 18 '20 at 20:46
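
To see what that comment means in practice, here is a minimal sketch using only the standard library (the variable names are illustrative, not from the original post). Converting a float to `Decimal` or `Fraction` reveals the exact binary value that is actually stored for the literal `0.1`.

```python
from decimal import Decimal
from fractions import Fraction

# Decimal(float) converts the stored binary value exactly, with no rounding.
stored_tenth = Decimal(0.1)
print(stored_tenth)
# 0.1000000000000000055511151231257827021181583404541015625

# The same value as an exact ratio of integers; the denominator is 2**55.
ratio_tenth = Fraction(0.1)
print(ratio_tenth)
# 3602879701896397/36028797018963968

# The stored value is close to, but not equal to, the decimal 1/10.
print(stored_tenth == Decimal("0.1"))  # False
```

Each element of the array carries a tiny error of this kind, and further rounding happens at every addition, which is why the computed sum can land just below 1.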

2 Answers

1

This is due to the limited precision of machine floating-point arithmetic. It is explained in detail here: https://docs.python.org/3/tutorial/floatingpoint.html

If you only need a fixed number of decimal places, you can round the result:

b = round(np.sum(a), 5)
print(b)
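
Rounding hides the error rather than removing it. As a hedged alternative, the sketch below uses two standard-library tools: `math.fsum`, which tracks the low-order bits that ordinary addition loses, and `math.isclose`, which compares with a tolerance instead of `==`. The plain Python list and the variable names are assumptions for illustration; for arrays, `np.isclose` plays a similar role.

```python
import math

values = [0.1, 0.2, 0.2, 0.2, 0.2, 0.1]

# Plain left-to-right accumulation: every += rounds to the nearest double.
naive_total = 0.0
for v in values:
    naive_total += v

# fsum tracks intermediate partial sums and rounds only once at the end.
exact_total = math.fsum(values)
print(exact_total)  # 1.0

# Compare floats with a tolerance rather than with ==.
print(math.isclose(naive_total, 1.0))  # True
```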
Reza
0

You can change the precision by choosing a different data type:

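import numpy as np

n = 1000
x = np.array([1 / n] * n)

# Accumulate the same sum in increasingly wide floating-point types.
# Note: float128 is not available on every platform (e.g. Windows builds of NumPy).
print(abs(1 - x.sum(dtype='float32')))
print(abs(1 - x.sum(dtype='float64')))
print(abs(1 - x.sum(dtype='float128')))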

will produce:

1.1920928955078125e-07
4.440892098500626e-16
2.0816681711721685133e-17
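
For context on the magnitudes above, the following sketch (standard library only; the names are illustrative) computes the machine epsilon, the gap between 1.0 and the next larger representable number, for the two most common formats. It assumes the usual IEEE 754 layouts: 23 fraction bits for float32 and 52 for float64.

```python
import sys

# Machine epsilon: spacing between 1.0 and the next larger representable value.
eps_float32 = 2.0 ** -23   # IEEE 754 single precision (assumed layout)
eps_float64 = 2.0 ** -52   # IEEE 754 double precision

print(eps_float32)  # 1.1920928955078125e-07
print(eps_float64)  # 2.220446049250313e-16

# Python's own float is a double, so the interpreter reports the same epsilon.
print(eps_float64 == sys.float_info.epsilon)  # True
```

The float32 and float64 errors printed above are one and two times these values, so each sum is off by only a few units in the last place of its type. A wider type shrinks the error but does not eliminate it.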

NesteruS
    This only happens to produce the desired result by sheer fluke. Using float32 actually makes rounding error much worse. – user2357112 Jul 18 '20 at 20:59