
Why does this fail? I create an array, create a new variable with that array minus a value from within the array, and then compare the array to a value that appears to be in the array. So why does the equality test fail?

import numpy as np
import platform
print platform.python_version()
print np.__version__ 
x = np.arange( -1,1,0.1 )
new_x = x - x[5]
print new_x
print new_x == -0.2

outputs:

2.7.9
1.9.2
[-0.5 -0.4 -0.3 -0.2 -0.1  0.  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1.   1.1  1.2  1.3  1.4]
[False False False False False False False False False False False False False False False False False False False False]

EDIT: Using np.round() causes the comparison to behave as expected. The question now is: why am I being shown rounded numbers when I print the array? In my experience, Python will usually print scientific notation or just a bunch of decimal places when the numbers are not exact.

Shockingly, I have been programming scientifically in Python for 6 years and have never seen this! It feels like a noob question, but I really don't understand why what's printed is rounded.

Brian Hayden
  • I just tested, and np.linspace() behaves the same way. I am assuming this is a precision thing but it would be nice to know exactly why it's happening. – Brian Hayden Jul 05 '15 at 00:52
  • Try `print np.round(new_x, decimals=1) == -0.2` – Scott Jul 05 '15 at 00:54
  • Thanks Scott, that works. In the past I've always noticed that numbers that are not exact are printed in scientific notation. So why am I getting rounded numbers printed but not in the comparison? That's really what I'm trying to understand. – Brian Hayden Jul 05 '15 at 00:56
  • I don't know the answer as to why, but it seems due to floating point precision. – Scott Jul 05 '15 at 00:56
  • Scott, definitely: `>>> x[12]` yields `0.19999999999999973`, and `>>> x[12] - 0.2` yields `-2.7755575615628914e` –  Jul 05 '15 at 01:02
  • You're welcome. I'm hoping someone will explain why; I'm also curious. – Scott Jul 05 '15 at 01:03
  • For me, printing x[12] in the script prints exactly 0.2 – Brian Hayden Jul 05 '15 at 01:06
  • It looks like by default, numpy will round to 8 digits when printing: http://docs.scipy.org/doc/numpy/reference/generated/numpy.set_printoptions.html – Brian Hayden Jul 05 '15 at 01:07
  • This is interesting, try `np.set_printoptions(precision=25)` just below your import, http://docs.scipy.org/doc/numpy/reference/generated/numpy.set_printoptions.html. So yeah, it's most likely numpy's `ndarray.__str__`. – Scott Jul 05 '15 at 01:09
  • http://stackoverflow.com/questions/5595425/what-is-the-best-way-to-compare-floats-for-almost-equality-in-python has some useful ideas, particularly `np.isclose(x[12], 0.2, rtol=1e-05, atol=1e-08, equal_nan=False)`, which is a bit of overkill for two numbers but useful otherwise. Scott, do a search for "Floating point representation" to find out why, with examples. –  Jul 05 '15 at 01:19

1 Answer


As you suspect, the strange behavior is due to small precision errors that occur during the floating point calculations. To see the differences, you can convert the floats into a hexadecimal form using float.hex():

>>> new_x[3].hex()
'-0x1.9999999999998p-3'
>>> (-0.2).hex()
'-0x1.999999999999ap-3'

Notice that they are, in fact, two different floating point numbers. The "rounding" only occurs while printing, and is in fact something that Python 2 itself does by default for floats (NumPy arrays additionally apply their own print precision, 8 digits by default, as noted in the comments on the question). The basic idea is that there are two forms of string representations for Python objects (including floats): str and repr. Whereas repr should return an "authentic", reproducible string representation, str should optimize for human-readability - and part of that includes "rounding", to hide small precision errors. Notice:

>>> repr(new_x[3])
'-0.19999999999999996'
>>> repr(-0.2)
'-0.2'
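
The same effect can be seen at the array level. The following is a minimal sketch, not part of the original answer, assuming NumPy's default print precision of 8 digits (per the `set_printoptions` documentation linked in the comments). It uses `np.printoptions`, which needs NumPy 1.15 or later; on older versions such as the 1.9.2 in the question, `np.set_printoptions(precision=17)` should have a similar effect. The names `default_text` and `full_text` are only for illustration:

```python
import numpy as np

x = np.arange(-1, 1, 0.1)
new_x = x - x[5]

# With the default print options, each element is rounded to 8 digits
# of precision, so the element is displayed as a clean -0.2.
default_text = np.array2string(new_x[3:4])

# Temporarily raising the precision to 17 digits shows the stored value.
with np.printoptions(precision=17):
    full_text = np.array2string(new_x[3:4])

print(default_text)
print(full_text)

# The comparison always uses the stored value, not the displayed text.
print(new_x[3] == -0.2)
```

The point of the sketch is that the print options only change the text that is displayed; the stored value, and therefore the result of `==`, is the same in both cases.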

For floats, you can simulate an equality check using numpy.isclose(), like so:

>>> print np.isclose(new_x, -0.2)
[False False False  True False False False False False False False False
 False False False False False False False False]
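
As a rough sketch of how this might be used in practice (an illustration, not part of the original answer): the tolerances below are the documented defaults of `np.isclose`, written out explicitly, and the use of `np.flatnonzero` to recover the matching index is an addition for this example. The code uses `print()` with parentheses so that it also runs on Python 3:

```python
import numpy as np

x = np.arange(-1, 1, 0.1)
new_x = x - x[5]

# np.isclose checks |a - b| <= atol + rtol * |b| for each element;
# rtol=1e-05 and atol=1e-08 are its default tolerances.
mask = np.isclose(new_x, -0.2, rtol=1e-05, atol=1e-08)

# Indices of the elements that match within tolerance.
matches = np.flatnonzero(mask)
print(matches)

# Rounding before comparing, as suggested in the comments on the question,
# is an alternative when the grid spacing (0.1 here) is known in advance.
rounded_matches = np.flatnonzero(np.round(new_x, decimals=1) == -0.2)
print(rounded_matches)
```

Both approaches pick out the single element that is displayed as -0.2. The tolerance-based check is usually the more general choice, because it does not depend on knowing how many decimals the values were meant to have.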
voithos
  • When calling `repr(new_x)` each value within the numpy array is rounded. Seems inconsistent. Any idea why this was chosen over an array where each value is the full repr value? – Scott Jul 05 '15 at 02:51
  • @Scott: Hmm, that *is* odd. Unfortunately, I don't know why this is the case. You can always [submit a ticket](https://github.com/numpy/numpy/issues) about it. – voithos Jul 05 '15 at 03:10