I am trying to compare two different lists to see if they are equal, and was going to remove NaNs, only to discover that my list comparisons still work, despite NaN == NaN -> False.

Could someone explain why the following expressions evaluate to True or False? I find this behavior unexpected. Thanks.

I have read the following which don't seem to resolve the issue:

(Python 2.7.3, numpy-1.9.2)

I have marked surprising evaluations with *** at the end.

>>> import numpy as np
>>> nan = np.nan
>>> [1,2,3]==[3]
False
>>> [1,2,3]==[1,2,3]
True
>>> [1,2,nan]==[1,2,nan]
True ***
>>> nan == nan
False
>>> [nan] == [nan]
True ***
>>> [nan, nan] == [nan for i in range(2)]
True ***
>>> [nan, nan] == [float(nan) for i in range(2)]
True ***
>>> float(nan) is (float(nan) + 1)
False
>>> float(nan) is float(nan)
True ***
oliversm
  • This is entirely explained in the first post you linked: when you test equality on two lists, identity is tested before equality, and `nan is nan` is `True` because `nan` and `nan` are the same object. – Holt Aug 26 '16 at 12:27
  • @Holt but how is identity defined in this case since there is no declaration like `alist = [nan]`? does this happen in memory somehow? – Ma0 Aug 26 '16 at 12:30
  • @Holt, In that case I find the following surprising as I thought `float` would create a new instance, e.g. I expected, `float(nan) == float(nan)` to be `False` but it is `True`, while `float(nan) is (float(nan) + 1) --> False`. – oliversm Aug 26 '16 at 12:32
  • @oliversm `nan` is an object in memory, so whenever you use it, it always refers to the same object. When you do `float(nan)`, nothing happens because `nan` is already a `float`, whereas when you do `float('nan')` or `float(nan) + 1`, a new object is created, so the identity check no longer holds. – Holt Aug 26 '16 at 12:33
  • @oliversm Replace `nan = np.nan` with `nan = float('nan')`. You now have an object named `nan` that is a `float('nan')`. Per the standard, `float('nan') == float('nan')` is `False`, but `nan is nan` is obviously `True` since `nan` and `nan` are the same object! – Holt Aug 26 '16 at 12:36

1 Answer

To understand what happens here, replace nan = np.nan with foo = float('nan'); you will get exactly the same results. Why?

>>> foo = float('nan')
>>> foo is foo # This is obviously True! 
True
>>> foo == foo # This is False per the standard (nan != nan).
False
>>> bar = float('nan') # foo and bar are two different objects.
>>> foo is bar
False
>>> foo is float(foo) # "Tricky", but float(x) is x if type(x) == float.
True

Now think of numpy.nan as just a name bound to a single float('nan') object.
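As a minimal sketch of that point, the snippet below uses math.nan from the standard library (Python 3.5+) as a stand-in for numpy.nan; that substitution is an assumption made only so the example runs without NumPy. The names alias and fresh are illustrative.

```python
import math

nan = math.nan   # stand-in for np.nan (assumption: avoids needing NumPy here)
alias = nan      # a second name bound to the very same object

is_plain_float = type(nan) is float   # the constant is an ordinary Python float
same_object = alias is nan            # both names refer to one object
equal_by_value = alias == nan         # value comparison of NaN with itself

fresh = float('nan')                  # a newly created, separate NaN object
fresh_is_same = fresh is nan

print(is_plain_float, same_object, equal_by_value, fresh_is_same)
# True True False False
```

Every use of the name refers to one object, so identity holds even though the value comparison does not.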

[nan] == [nan] is True because list comparison tests items for identity before testing them for equality. Think of it as:

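def equals(l1, l2):
    # Lists of different lengths are never equal.
    if len(l1) != len(l2):
        return False
    for u, v in zip(l1, l2):
        # Identity is checked first; != is only consulted for distinct objects.
        if u is not v and u != v:
            return False
    return True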
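A practical consequence is that the result of a list comparison depends on whether the NaNs are the same object. If the goal is a comparison that treats any two NaNs as equal regardless of identity, one possible sketch is below; equals_nan_aware is a hypothetical helper written for illustration, not part of Python or NumPy.

```python
import math

def equals_nan_aware(l1, l2):
    """Hypothetical helper: compare two lists, treating any two float NaNs as equal."""
    if len(l1) != len(l2):
        return False
    for u, v in zip(l1, l2):
        both_nan = (isinstance(u, float) and isinstance(v, float)
                    and math.isnan(u) and math.isnan(v))
        if not both_nan and u != v:
            return False
    return True

shared = float('nan')

# Same NaN object on both sides: the identity shortcut makes the lists equal.
same_object_lists = [1, 2, shared] == [1, 2, shared]

# Two separately created NaN objects: identity fails, then == fails too.
distinct_object_lists = [float('nan')] == [float('nan')]

# The helper ignores identity and matches NaNs by value class instead.
nan_aware = equals_nan_aware([1.0, float('nan')], [1.0, float('nan')])

print(same_object_lists, distinct_object_lists, nan_aware)
# True False True
```

If NaNs should instead never compare equal, dropping the both_nan check and comparing each pair with == alone would give that stricter behavior.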
Holt
  • Why wouldn't you then expect `float(nan) is float(nan)` to be `False`? As in my console it evaluates to `True`. – oliversm Aug 26 '16 at 12:42
  • @oliversm See the last statement in the first code block: `float(a) is a` evaluates to `True` if `type(a) == float`. – Holt Aug 26 '16 at 12:43