2

When I execute these two lines, I get two different results. Why?

The variable item is of type numpy.float32.

print(item)
print(item * 1)

output:

0.0006
0.0006000000284984708

I suspect this is related to the numpy.float32 type somehow?

If I try to convert the numpy.float32 to float, I get this:

item = float(item)
print(item)

output:

0.0006000000284984708
Mad Physicist
skywalkerdk
  • welcome to the wonderful world of floating point arithmetic. i don't think you are experiencing an issue, this is just how floating point works. indeed i'm getting these exact results on my end – Nullman Jul 22 '19 at 10:18
  • Is there any way to avoid this? It's causing me heaps of trouble down the road. – skywalkerdk Jul 22 '19 at 10:20
  • what kind of problems? you can try reading [this answer](https://stackoverflow.com/questions/24432648/strange-multiplication-result/24433718#24433718) to what seems to be a similar issue, also you can try [this](https://stackoverflow.com/a/25182820/7540911) to circumvent issues – Nullman Jul 22 '19 at 10:21
  • @jottbe you can adjust the **printed** precision with [np.set_printoptions](https://docs.scipy.org/doc/numpy/reference/generated/numpy.set_printoptions.html) – Nullman Jul 22 '19 at 10:33
  • I tried adjusting the printoptions, but it yields the same. – skywalkerdk Jul 22 '19 at 10:33
  • @gspr. This question is about an actual type conversion, so no – Mad Physicist Jul 22 '19 at 11:03
  • What is the dtype of `item * 1`? It seems like it's being widened to `float64`, which is weird, but explainable. What version of numpy are you using? – Mad Physicist Jul 22 '19 at 11:05
  • @Nullman: thank you for the info, so that's the reason why I didn't see the digits like in the post above, but of course the precision error is still present and could increase if the value is used for calculations. – jottbe Jul 22 '19 at 11:08
  • @gspr: no it is not broken, that's a normal phenomenon. You can't represent arbitrary numbers with `float`. The system just tries to use the `float` representation which has the smallest deviation from the real value and usually that doesn't harm. – jottbe Jul 22 '19 at 11:11
  • @jottbe: I'm well aware of that. Hence I linked to the canonical question/answer on that topic. The comment is how Stack Overflow displays such a notice of possible duplication. – gspr Jul 22 '19 at 11:17

4 Answers

4

What you observe unfortunately is not avoidable. It has to do with the internal representation of a float number. In this case it doesn't even have to do with calculation issues, as suggested in comments here.

(Binary) floating point numbers, as used by most languages, are represented as (+/- mantissa)*2^exponent. The important part here is the mantissa, which cannot represent all numbers exactly. The value ranges of the mantissa and the exponent depend on the bit length of the float you use. The exponent is responsible for the maximum and minimum representable numbers, while the mantissa is responsible for the precision of the representable numbers (loosely speaking, the "granularity" of the numbers).

So for your question, the mantissa is more important. It is like a bit array. In a byte, each bit has a value of 1, 2, 4, ... depending on its position. In the mantissa it is similar, but instead of 1, 2, 4, ... the bits have the values 1/2, 1/4, 1/8, ...

So if you want to represent 0.75, the bits with the values 1/2 and 1/4 would be set in your mantissa and the exponent would be 0. That's it in very short. Now, if you try to represent a value like 0.11 as a float, you will notice that it is not possible, no matter whether you use float32 or float64:

import numpy as np
item=np.float64('0.11')
print('{0:2.60f}'.format(item))
output: 0.110000000000000000555111512312578270211815834045410156250000

item=np.float32('0.11')
print('{0:2.60f}'.format(item))
output: 0.109999999403953552246093750000000000000000000000000000000000

By the way, if you want to represent the value 0.25 (1/4), it is not the bit for 1/4 that is set; instead the bit for 1/2 is set and the exponent is -1, so 1/2*2^(-1) is again 0.25. This is the result of a normalization process.
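The mantissa/exponent picture above can be checked with a small sketch using the standard library's math.frexp, which splits a float into a mantissa between 0.5 and 1 and a power-of-two exponent. It works on Python's 64 bit floats rather than on float32, and the variable names are only illustrative:

```python
import math

# math.frexp(x) returns (m, e) with x == m * 2**e and 0.5 <= |m| < 1,
# which matches the simplified "mantissa * 2^exponent" picture.
m075, e075 = math.frexp(0.75)  # 0.75 = (1/2 + 1/4) * 2**0
m025, e025 = math.frexp(0.25)  # 0.25 = (1/2) * 2**-1 after normalization

print(m075, e075)  # 0.75 0
print(m025, e025)  # 0.5 -1

# 0.11 is not a finite sum of 1/2, 1/4, 1/8, ..., so the stored mantissa
# is only the closest value that fits into the available bits.
m011, e011 = math.frexp(0.11)
print(m011, e011)
```

For 0.75 and 0.25 the decomposition is exact, while for 0.11 the mantissa is already a rounded value before any calculation takes place.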

If you want to increase the precision, you could use float64, as I did in my example. It reduces the error but does not remove it.

It seems that some systems also support decimal-based floats. I haven't worked with them, but they would probably avoid this kind of problem (though not the calculation issues mentioned in the answer someone else posted).
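In Python, one such decimal-based type is decimal.Decimal from the standard library. The following is a minimal sketch of how it behaves, not a recommendation to switch; the variable names are made up for illustration:

```python
from decimal import Decimal

exact = Decimal("0.11")     # built from a string: exactly 0.11
from_float = Decimal(0.11)  # built from a binary float: inherits its error

print(exact)
print(from_float)

# Sums of decimal fractions stay exact in Decimal, but not in binary floats.
binary_sum_ok = (0.1 + 0.2 == 0.3)
decimal_sum_ok = (Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))
print(binary_sum_ok)   # False
print(decimal_sum_ok)  # True
```

Decimal only moves the problem to a different base: values such as 1/3 still cannot be stored exactly.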

jottbe
  • 2
    note that the `decimal` module is great for getting the decimal expansion of floats, e.g. `print(decimal.Decimal(0.006))`, or with a numpy array `arr.astype(np.dtype(Decimal))` – Sam Mason Jul 22 '19 at 13:21
  • Thanks for the tip. I never used this. But if you calculate with monetary amounts, probably it is a good alternative. So he could also calculate with the Decimal class, but then he should use `decimal.Decimal('0.006')` rather than `decimal.Decimal(0.006)`, because in the last case the value is passed as a float and the damage is already done. – jottbe Jul 22 '19 at 13:29
  • except that native python floats are a couple of orders of magnitude faster than the decimal module, and numpy is even faster when you have a few thousand values to work with. floating point formats are almost always the right thing to use, they can just be a bit confusing at times! – Sam Mason Jul 22 '19 at 13:43
3

The reason you see two different results is that your variable item is a numpy.float32, as you said. Python itself uses 64 bit floating point numbers, so

print(item)

returns the (lower precision) result in 32 bit, while

print(item * 1)

first multiplies by 1, which is an integer. An integer and a float cannot be multiplied directly, so both operands are first converted to a common float type; here that is a 64 bit float, since you did not specify anything else. The result is then a 64 bit float.

If you specify the type of the "1" explicitly,

print(item * numpy.float32(1))

returns the same result as print(item), because there is no type conversion and everything can stay in 32 bit.
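A quick way to see which type each product has is to inspect it directly. This is only a sketch: the type of item * 1 depends on NumPy's promotion rules, and newer NumPy releases may keep it at float32 instead of widening it as in the question.

```python
import numpy as np

item = np.float32(0.0006)

same_width = item * np.float32(1)  # both operands are float32
mixed = item * 1                   # float32 scalar times a Python int

print(type(same_width).__name__)  # float32
print(type(mixed).__name__)       # float64 in the question; version dependent
print(same_width == mixed)        # True: only the width differs, not the value
```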

StefanS
1

You haven't specified exactly what the problem is, beyond "the numbers don't match". How you handle floating point depends a little on your application, but in general you can't rely on comparing floating point numbers exactly, with a few obvious exceptions: 0 times anything should be 0, and 1 times anything should be that same thing (there are more, but let's stop there). So why is 1*item different from item?

>>> item = np.float32(0.0006)
>>> item
0.0006
>>> item*1
0.0006000000284984708

Right, this seems to contradict common sense. But no, printing is just the wrong way to compare the two. Do an actual comparison and everything is still alright with the world.

>>> item == item*1
True

The numbers are the same. This should make sense: increasing the precision of a floating point number shouldn't change its value, and multiplying by 1 should not change a number.
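Both points can be sketched in a few lines: widening to 64 bit keeps the value, and a comparison against the literal 0.0006 should use a tolerance. The rel_tol value below is an arbitrary choice for illustration, not something from the question.

```python
import math
import numpy as np

item = np.float32(0.0006)
wide = np.float64(item)  # explicit widening to 64 bit

print(wide == item)           # True: widening keeps the stored value
print(float(item) == 0.0006)  # False: the float32 never held exactly 0.0006
print(math.isclose(float(item), 0.0006, rel_tol=1e-6))  # True
```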

So, what's going on? Numpy converts an np.float32 value to a Python float, which prints with nice rounding. However, item*1 is an np.float64, which by default shows more significant figures. If you print both of these with the same number of significant figures, you can see there's no real difference.

>>> "{:0.015f}".format(item*1)
'0.000600000028498'

>>> "{:0.015f}".format(item)
'0.000600000028498'

So that's it. What python prints isn't meant to be a completely accurate representation of numbers. The other answers get into why 0.0006 can't be represented exactly.

Edit: Rounding doesn't change this; it just converts item to a Python float, which prints with rounding.

>>> "{:0.015f}".format(round(item, 4))
'0.000600000028498'
user2699
0

I cannot seem to find the logic in this, but I have made a workaround by simply converting the numpy.float32 to float and rounding the numbers to a specific number of decimals.
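A minimal sketch of that workaround, assuming four decimal places are enough for the data (the answer does not say which precision was used):

```python
import numpy as np

item = np.float32(0.0006)

as_float = float(item)        # 0.0006000000284984708, as in the question
rounded = round(as_float, 4)  # 4 decimals is an assumed precision

print(rounded)  # 0.0006
```

The rounded value is still a binary float; it is simply the 64 bit float closest to 0.0006, which prints the way the literal does.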

skywalkerdk
  • 1
    it's "because" a 32bit floating point number can't represent 0.006 exactly, the closest value is the one you see when you coerce to a 64bit float (i.e. by doing `float(item)`, as Python only has 64bit floats natively). many more details here: https://docs.python.org/3/tutorial/floatingpoint.html – Sam Mason Jul 22 '19 at 13:51