
I am trying to put some numbers into a NumPy array:

>>> np.array([20000001]).astype('float32')
array([ 20000000.], dtype=float32)

Where did the 1 go?

Bob
  • Possible duplicate of [Is floating point math broken?](http://stackoverflow.com/questions/588004/is-floating-point-math-broken) – NobodyNada Feb 15 '17 at 16:30

3 Answers


You simply don't have enough precision. A float32 has only about 7 decimal digits of precision, whereas a float64 has about 16. So any time you convert to a float32, the result is only guaranteed to be "correct" to within about one part in 10^7. For example, you can try this:

>>> np.array([20000001]).astype('float64')
array([ 20000001.])

That's the expected answer. (The dtype=float64 is automatically omitted, because that's the default.) In fact, you can go further and find

>>> np.array([2000000000000001]).astype('float64')[0]
2000000000000001.0

but

>>> np.array([20000000000000001]).astype('float64')[0]
20000000000000000.0

At some point, no matter how high your precision, you'll always get to the point where floats drop the least significant digits. See here for more info on floats.
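That cutoff can be pinned down exactly for float64: its significand holds 53 bits, so every integer up to 2**53 is representable, and 2**53 + 1 is the first one that isn't. A quick sketch:

```python
import numpy as np

# 2**53 is the last point where float64 represents every integer exactly
exact = np.float64(2**53)        # 9007199254740992.0 -- representable
rounded = np.float64(2**53 + 1)  # rounds back down to 2**53

print(int(exact))    # 9007199254740992
print(int(rounded))  # 9007199254740992 -- the +1 was lost
```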

On the other hand, Python's int objects can keep track of many more digits. In Python 3, int is arbitrary precision, so integers never silently lose digits. See here for more info on ints.
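A short sketch of the contrast (the large value here is just an arbitrary example, well past float64's 53-bit significand):

```python
n = 20000000000000000000000001  # arbitrary big odd integer, far beyond 2**53

# Python ints are arbitrary precision: integer arithmetic stays exact
assert n + 1 - 1 == n

# Round-tripping through a float loses the low digits
assert int(float(n)) != n
```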

Mike

With float32 you simply can't resolve the difference:

>>> np.finfo(np.float32).eps
1.1920929e-07

eps here gives you "the smallest representable positive number such that 1 + eps != 1", which is a measure of float32 accuracy. Multiply that by 20,000,000 and you get about 2.4: the gap between neighbouring float32 values near 20,000,000 is larger than 1, so adding 1 cannot change the value.

More formally: if one wants to avoid computing the binary representation of n, then eps * n / base is a convenient lower bound for the resolution around n, while, as @hobbs points out, eps * n is an upper bound.

Also note that, for example, 1 + 0.6*eps may actually return something != 1; this, however, is due to rounding: subtracting 1 from the result gives eps, not 0.6*eps.
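As a sketch of those bounds: NumPy's `np.spacing` reports the actual resolution at a given value, which can be compared against eps * n / base and eps * n:

```python
import numpy as np

n = np.float32(20000000)
eps = np.finfo(np.float32).eps   # 2**-23 for float32

actual = np.spacing(n)           # true gap to the next float32: 2.0 here
lower = eps * n / 2              # eps * n / base -- the lower bound
upper = eps * n                  # the upper bound

assert lower <= actual <= upper
assert actual == 2.0             # so 20,000,000 + 1 is not representable
```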

Paul Panzer
  • That's not exactly the right way to compute that... although it is within a factor of 2 of the correct value. The actual precision of a 32-bit float in the vicinity of 20,000,000 is exactly 2 (as it is for any value between 2^24 + 1 and 2^25). – hobbs Feb 15 '17 at 16:03
  • @hobbs well, unless I'm completely mistaken eps x n / base does give you a lower bound for the smallest number you can add to n such that etc. – Paul Panzer Feb 15 '17 at 16:16
  • upper, rather. The precision at 1.99 is the same as it is at 1, not 1.99 times worse; it jumps at 2. – hobbs Feb 15 '17 at 16:19
  • @hobbs and what use would an upper bound be here, hm? It is very simple: a lower bound for some y is a number that is guaranteed not to exceed y. If we want to show that 1 can't be effectively added to 20,000,000, we need to show that r > 1, where r is the resolution at 20,000,000. If we have a lower bound b for r and b > 1, then we can conclude r > 1 because r >= b > 1. With an upper bound this does not work. – Paul Panzer Feb 15 '17 at 16:36
  • I don't disagree that an upper bound is not very useful; my point is that the value you provided *is* an upper bound, not a lower bound. – hobbs Feb 15 '17 at 17:06
  • (Luckily, the lower bound is simply half that.) – hobbs Feb 15 '17 at 17:07
  • @hobbs just to be sure you did see the 1 / base there? Or are we disagreeing on rounding? Something like 20,000,000 + 1.1 = 20,000,002? I'd say that doesn't count since you haven't added 1.1 you've added 2. If you check `np.float32(1) + np.float32(eps*0.6)` you'll also get something which is not 1, so the library agrees with me on that. – Paul Panzer Feb 15 '17 at 17:37
  • I missed the `/base`. Apologies. – hobbs Feb 15 '17 at 17:49
  • @hobbs no prob, sorry for getting tetchy so easily. I'll clarify the post. – Paul Panzer Feb 15 '17 at 17:51

First of all, float64 works in this case:

>>> np.array([20000001]).astype('float32')
array([ 20000000.], dtype=float32)
>>> np.array([20000001]).astype('float64')
array([ 20000001.])


How does a float work under the hood:

[figure: layout of an IEEE 754 float — sign bit, exponent bits, significand bits]


What's the difference between float32 and float64?

  • 32-bit (single-precision float): 24-bit significand
  • 64-bit (double-precision float): 53-bit significand
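These widths can be read off with `np.finfo` (a sketch; note that `nmant` counts the explicitly stored bits, one less than the effective precision because of the implicit leading 1):

```python
import numpy as np

# nmant is the count of explicitly stored significand bits
assert np.finfo(np.float32).nmant == 23   # + implicit leading 1 -> 24 bits
assert np.finfo(np.float64).nmant == 52   # + implicit leading 1 -> 53 bits

# Consequence: 2**24 + 1 is the first integer float32 cannot hold exactly
assert np.float32(2**24) == 2**24
assert np.float32(2**24 + 1) == np.float32(2**24)
```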


With float32, you get 23 explicitly stored bits for the significand, plus 1 sign bit and 8 exponent bits. Let's view 20000001 in binary:

0b 1 0011 0001 0010 1101 0000 0001  ---->
0b 1 0011 0001 0010 1101 0000 00

So the last two bits "01" get rounded off when converting from int to float32, leaving 20000000.

Interestingly, converting 20000003 will get you 20000004:

>>> np.array([20000003]).astype('float32')
array([ 20000004.], dtype=float32)

And that is:

0b 1 0011 0001 0010 1101 0000 0011  ---->
0b 1 0011 0001 0010 1101 0000 01
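As a sketch, the rounding can be observed directly by reinterpreting the float32's four bytes as an integer and reassembling the value from the IEEE 754 fields:

```python
import numpy as np

for n in (20000001, 20000003):
    f = np.float32(n)
    bits = int(f.view(np.int32))              # reinterpret the 4 bytes
    stored = bits & 0x7FFFFF                  # 23 stored significand bits
    exp = ((bits >> 23) & 0xFF) - 127         # unbiased exponent
    value = (2**23 + stored) * 2**(exp - 23)  # (1.stored) * 2**exp
    print(f"{n} -> {value}")
    # 20000001 -> 20000000
    # 20000003 -> 20000004
```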
greedy52
  • Very nice explanation. Please allow me to point out a small inaccuracy: even though there are 23 bits for the significand, the actual precision is 24 bits because of the implicit leading 1, which needn't be stored. You can check this with `np.float32(20_000_002)`, which is represented exactly. – Paul Panzer Feb 15 '17 at 19:34