Causes for inconsistent behavior when adding NaNs to a set

Question

Python's set behaves in a way I find puzzling when NaNs are involved (here live):

>>> float('nan') in {float('nan')}    # example 1
False
>>> nan = float('nan')                # example 2
>>> nan in {nan}
True

At first I wrongly assumed that this was the behavior of the == operator, but that is obviously not the case, because both comparisons yield False, as expected (here live):

>>> float('nan') == float('nan') 
False
>>> nan = float('nan')
>>> nan == nan
False

I'm mainly interested in the causes for this behavior. But if there is a way to ensure consistent behavior, that would also be nice to know!

Python's containers assume that `==` is an equivalence relation. If it's not, notions of containment fall apart. You shouldn't be putting NaN in a set in the first place. — user2357112, Jul 31 '18 at 21:07
Probably relevant: https://stackoverflow.com/questions/9089400/python-set-in-operator-uses-equality-or-identity — Oliver Charlesworth, Jul 31 '18 at 21:07
See also [here](https://stackoverflow.com/questions/20320022/why-in-numpy-nan-nan-is-false-while-nan-in-nan-is-true). — DSM, Jul 31 '18 at 21:16

wim · Accepted Answer · 2018-08-01T15:17:39.587

Set membership does an identity check as a short-circuit before falling back to an equality check (the CPython source is in setobject.c; see also the note below PyObject_RichCompareBool).
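As a rough illustration of that short-circuit, here is a minimal pure-Python sketch. The helper `same_element` is hypothetical (it is not a CPython function); it only approximates the "identity first, then equality" comparison applied to a candidate element, and it ignores hashing entirely. It shows why the two examples in the question differ: example 1 compares two distinct NaN objects, while example 2 compares one NaN object with itself.

```python
def same_element(x, y):
    # Hypothetical helper: identity is checked first, and == is only
    # consulted when the two operands are not the same object.
    return x is y or x == y


nan = float('nan')
other_nan = float('nan')  # a second, distinct NaN object

# Example 1 from the question: two different NaN objects.
# Identity fails and NaN == NaN is False, so there is no match.
example_1 = other_nan in {nan}

# Example 2 from the question: the very same object.
# Identity succeeds, so == is never consulted.
example_2 = nan in {nan}

print(example_1, example_2)
print(same_element(nan, other_nan), same_element(nan, nan))
```

So the result depends on whether the NaN being looked up is the same object as the one stored in the set, not on what `==` returns.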

Python core devs are motivated by these invariants:

for a in container:
    assert a in container    # this should ALWAYS be true

Ergo:

assert a in [a]
assert a in (a,)
assert a in {a}
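A quick way to see these invariants hold even for NaN is the sketch below. It assumes nothing beyond the built-in list, tuple, and set types; the variable names are only for illustration.

```python
nan = float('nan')

# NaN is not equal to itself...
nan_equals_itself = (nan == nan)

# ...yet every element yielded by iterating a container is still
# reported as contained in it, because identity is checked first.
containers = [[nan], (nan,), {nan}]
invariant_holds = [all(item in c for item in c) for c in containers]

print(nan_equals_itself, invariant_holds)
```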

It was decided that ensuring these invariants was the most important priority, and as for NaN: oh well. Special cases aren't special enough to break the rules. For all the gory details, see bpo issue4296:

Python assumes identity implies equivalence; contradicts NaN.
