x not in set versus x != each element in the set?

Question

Suppose we have x = ['a', 'b'].

What is happening under the hood for the statement:

x not in {None, False}

that raises the unhashable type: 'list' error?

The workaround I found is to write this instead:

x != None and x!= False

but I am confused because, mathematically, both boolean expressions are equivalent.

I'm sure this is a duplicate but I can't find a prior. It's a good question regardless. — Mad Physicist, Mar 13 '18 at 14:40
Out of curiosity, can you describe what you're actually trying to achieve in words? I suspect that neither option is really what you want. — Mad Physicist, Mar 13 '18 at 14:41
You mean to compare the list `x` right? (As opposed to the contents of `x`?) Like you don't mean `all([y not in {None, False} for y in x])`? — pault, Mar 13 '18 at 14:41
You are trying to find whether a list object is within a set of two values (meaning whether one of the two values in the set *is a list*). Lists cannot be hashed, so cannot be part of a set, so are not comparable. What do you want that operation to do? — deceze, Mar 13 '18 at 14:42
Is there some rationale for `[] in set()` raising an error rather than evaluating to `False`? — Patrick Haugh, Mar 13 '18 at 14:46
@Patrick Python warning you of nonsensical operations rather than quietly failing. — deceze, Mar 13 '18 at 14:48
@PatrickHaugh: Yes `[]` can't be in a `set()`. I did the same error in [\[SO\]: How to check whether a file exists? (@CristiFati's answer)](https://stackoverflow.com/questions/82831/how-to-check-whether-a-file-exists/44661513#44661513) (section 4.) — CristiFati, Mar 14 '18 at 03:20

CristiFati · Answer 1 · 2018-11-29T08:11:15.670

Rationale

Here's what the official documentation states:

[Python 3]: class set([iterable]):

Return a new set or frozenset object whose elements are taken from iterable. The elements of a set must be hashable.
[Python 3]: hashable:

An object is hashable if it has a hash value which never changes during its lifetime (it needs a __hash__() method), and can be compared to other objects (it needs an __eq__() method). Hashable objects which compare equal must have the same hash value.
...
All of Python’s immutable built-in objects are hashable; mutable containers (such as lists or dictionaries) are not.
[Python 3]: object.__contains__(self, item) (just above the anchor):

The membership test operators (in and not in) are normally implemented as an iteration through a sequence. However, container objects can supply the following special method with a more efficient implementation, which also does not require the object be a sequence.

Going into [GitHub]: python/cpython - (v3.5.4) cpython/Objects/setobject.c:

Line #1991:

static PyMethodDef set_methods[] = {
    {"add",             (PyCFunction)set_add,           METH_O,
     add_doc},
    {"clear",           (PyCFunction)set_clear,         METH_NOARGS,
     clear_doc},
    {"__contains__",(PyCFunction)set_direct_contains,           METH_O | METH_COEXIST,  // @TODO - cfati: MARK THIS LINE

Line #1843:

static PyObject *
set_direct_contains(PySetObject *so, PyObject *key)
{
    long result;

    result = set_contains(so, key);  // @TODO - cfati: MARK THIS LINE
    if (result == -1)
        return NULL;
    return PyBool_FromLong(result);
}

Line #1823:

static int
set_contains(PySetObject *so, PyObject *key)
{
    PyObject *tmpkey;
    int rv;

    rv = set_contains_key(so, key);    // @TODO - cfati: MARK THIS LINE
    if (rv == -1) {
        if (!PySet_Check(key) || !PyErr_ExceptionMatches(PyExc_TypeError))
            return -1;
        PyErr_Clear();
        tmpkey = make_new_set(&PyFrozenSet_Type, key);
        if (tmpkey == NULL)
            return -1;
        rv = set_contains_key(so, tmpkey);  // @TODO - cfati: MARK THIS LINE
        Py_DECREF(tmpkey);
    }
    return rv;
}

Line #627:

static int
set_contains_key(PySetObject *so, PyObject *key)
{
    setentry entry;
    Py_hash_t hash;

    if (!PyUnicode_CheckExact(key) ||
        (hash = ((PyASCIIObject *) key)->hash) == -1) {  // @TODO - cfati: MARK THIS LINE
        hash = PyObject_Hash(key);
        if (hash == -1)
            return -1;
    }
    entry.key = key;
    entry.hash = hash;
    return set_contains_entry(so, &entry);  // @TODO - cfati: MARK THIS LINE
}

Line #614:

static int
set_contains_entry(PySetObject *so, setentry *entry)
{
    PyObject *key;
    setentry *lu_entry;

    lu_entry = set_lookkey(so, entry->key, entry->hash);  // @TODO - cfati: MARK THIS LINE
    if (lu_entry == NULL)
        return -1;
    key = lu_entry->key;
    return key != NULL && key != dummy;
}

As seen from the "callstack" (presented in reversed order), in order to test for membership (in / not in), hash is being performed (on all code paths) on the candidate member ("includee"), and since the list instance doesn't have the hash functionality, the interpreter spits out TypeError.

Resolution

There is a number of ways to get around this (as many others already pointed out most of them):

Use a container that doesn't require its elements to be hashable (list, tuple)
Test for __hash__ member
Wrap the membership test in a try / except block
Use a a hashable container (tuple) for the element: x = ('a', 'b')

but (generally) these are just ways to get around the problem (this is my personal opinion), since if you end up comparing a list to None and False, the code (that yields that list) could use some refactoring.

Jean-François Fabre · Answer 2 · 2018-03-13T14:57:00.603

3

if you can enter all elements you want to test in a set, it means that all unhashable elements don't belong to your set (because you can't put them in)

You could do:

if x.__hash__ and x in {None,False}:

when object isn't hashable, x.__hash__ is None (other alternatives: Asking "is hashable" about a Python value) and the second part isn't evaluated.

or (better ask forgiveness than permission):

def belongs(x):
    try:
        return x in {None,False}
    except TypeError:   # unhashable type
        return False

both solutions are faster than using a list or tuple ((None,False)), because there's no linear search involved (that is if there are a lot of elements in the test list, not true for only 2 elements)

edited Mar 13 '18 at 14:57

answered Mar 13 '18 at 14:48

Jean-François Fabre

137,073
23
153
219

I would argue that `try: [] in {} except TypeError: ...` would be more pythonic. – deceze Mar 13 '18 at 14:49
@deceze yeah, I wanted to use short-circuit. Exceptions force to create a function. There are also other alternatives in the link I provided (like testing type aganst `collections.Hashable` but I suspect this is going to be slooooow) – Jean-François Fabre Mar 13 '18 at 14:50
1

Exceptions don't force the creation of a function; it depends on the context in which this code is used and how verbose/concise you want it. Perhaps you can abort your entire operation when a `TypeError` occurs or doesn't occur because obviously `x` is of the wrong type… – deceze Mar 13 '18 at 14:55

glibdud · Answer 3 · 2018-03-13T14:43:13.503

1

{None, False} is a set. Sets can only contain hashable objects, and therefore you can only test for membership of hashable objects. Lists are not hashable.

Instead, you could use a tuple to perform the same sort of membership comparison. Tuple elements do not need to be hashable.

x not in (None, False)

edited Mar 13 '18 at 14:43

answered Mar 13 '18 at 14:42

glibdud

7,550
4
27
37

6

so if `x` is unhashable, then it's not in the set :) – Jean-François Fabre Mar 13 '18 at 14:42
@Jean-FrançoisFabre Yes, that's true. – glibdud Mar 13 '18 at 14:43

score 0 · Answer 4 · answered Mar 13 '18 at 16:21

I would like to do a brief comparison on membership tests on set vs list

Membership tests invoke __contains__ dunder(if class implement this method). So, if we write

>>> 1 in [1,2]

it will equivalent to

>>> list.__contains__([1,2],1)
>>> True

If we do:

>>> [1,2] in [1,2,3]
>>> False #We get False instead of TypeError here

But why is above case not applicable to sets? Membership tests work in a different way in list and set. Infact list and set are implemented differently. Talking about set, they are implemented using Hash-Table. This allow sets to perform membership test i.e. lookups in O(1) as compared to list where lookups are O(n). So when in is performed on a set, __contains__ try to compute the hash of the object that need to be looked using __hash__. Since lists are unhashable in python, you get the error: TypeError: unhashable type: 'list'. If you do same with list you won't get any error since list don't compute hash for membership testing.

In short membership tests cannot be performed on sets with an object that is unhashable. Generally speaking all mutable objects(list, sets, dict) are unhashable.

x not in set versus x != each element in the set?

4 Answers4

Rationale

Resolution