
Python's set behaves in a way that puzzles me when NaNs are involved:

>>> float('nan') in {float('nan')}    # example 1
False
>>> nan = float('nan')                # example 2
>>> nan in {nan}
True

At first I wrongly assumed that this was the behavior of the `==` operator, but that is obviously not the case, because both comparisons yield False, as expected:

>>> float('nan') == float('nan') 
False
>>> nan = float('nan')
>>> nan == nan
False

I'm mainly interested in the cause of this behavior. But if there is a way to ensure consistent behavior, that would also be nice to know!

ead
  • Python's containers assume that `==` is an equivalence relation. If it's not, notions of containment fall apart. You shouldn't be putting NaN in a set in the first place. – user2357112 Jul 31 '18 at 21:07
  • Probably relevant: https://stackoverflow.com/questions/9089400/python-set-in-operator-uses-equality-or-identity – Oliver Charlesworth Jul 31 '18 at 21:07
  • See also [here](https://stackoverflow.com/questions/20320022/why-in-numpy-nan-nan-is-false-while-nan-in-nan-is-true). – DSM Jul 31 '18 at 21:16

1 Answer

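Set membership does an identity check as a short-circuit before considering an equality check (CPython source is in setobject.c, see also the note below PyObject_RichCompareBool). In example 1 the two `float('nan')` calls create two distinct objects, so the identity check fails and the equality check then returns False. In example 2 the same object is on both sides, so the identity check succeeds and `==` is never consulted.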
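As a rough sketch of that rule, the membership test can be modeled in pure Python as "identity first, then equality". The helper name `contains` is made up for illustration; the real set lookup also compares hashes before it gets this far, which this model leaves out.

```python
def contains(container, item):
    """Rough model of `item in container`: identity check, then equality."""
    return any(x is item or x == item for x in container)

nan = float('nan')

# Same object on both sides: the identity check succeeds.
same_object = contains([nan], nan)

# A freshly created NaN is a different object and also compares unequal.
fresh_object = contains([nan], float('nan'))

print(same_object, fresh_object)
```

The model agrees with the built-in `in` for both examples from the question.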

Python core devs are motivated by these invariants:

for a in container:
    assert a in container    # this should ALWAYS be true

Ergo:

assert a in [a]
assert a in (a,)
assert a in {a}
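A small check of these invariants with NaN as the element, plus a few other places where the same identity shortcut shows up (list comparison, `index`, `count`). This is only an illustration of the observable behavior, not a statement about how each operation is implemented internally.

```python
nan = float('nan')

# The invariant holds for every built-in container, even though nan != nan.
for container in ([nan], (nan,), {nan}):
    for a in container:
        assert a in container

# Other operations take the same shortcut when the very same object is involved.
same_lists_equal = [nan] == [nan]                     # same object in both lists
fresh_lists_equal = [float('nan')] == [float('nan')]  # two distinct NaN objects
position = [nan].index(nan)
occurrences = [nan].count(nan)

print(same_lists_equal, fresh_lists_equal, position, occurrences)
```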

It was decided that ensuring these invariants was the most important priority, and as for NaN: oh well. Special cases aren't special enough to break the rules. For all the gory details, see bpo issue4296:

Python assumes identity implies equivalence; contradicts NaN.
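As for getting consistent behavior: one possible approach, sketched here under the assumption that you want every float NaN to count as "the same" value, is to test for NaN explicitly with `math.isnan` instead of relying on `in`. The helper `contains_nan_aware` is hypothetical, not something from the standard library.

```python
import math

def is_nan(value):
    """True only for float NaN values; other types are never NaN here."""
    return isinstance(value, float) and math.isnan(value)

def contains_nan_aware(container, item):
    """Hypothetical helper: membership test that treats all NaNs as equal."""
    if is_nan(item):
        # Linear scan, because NaN cannot be found reliably by hash lookup.
        return any(is_nan(x) for x in container)
    return item in container

fresh_nan_found = contains_nan_aware({float('nan')}, float('nan'))
nan_missing = contains_nan_aware({1.0, 2.0}, float('nan'))
regular_lookup = contains_nan_aware({1, 2}, 2)

print(fresh_nan_found, nan_missing, regular_lookup)
```

The NaN branch gives up the constant-time set lookup, so this trades speed for predictability.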

wim