How is the __contains__ method of the list class in Python implemented?

Question

Suppose I define the following variables:

mode = "access"
allowed_modes = ["access", "read", "write"]

I currently validate mode with the following statement:

assert any(mode == allowed_mode for allowed_mode in allowed_modes)

However, it seems that I can replace this simply with

assert mode in allowed_modes

According to ThiefMaster's answer in Python List Class __contains__ Method Functionality, these two should be equivalent. Is this indeed the case? And how could I easily verify this by looking up Python's source code?

I found this: https://github.com/python/cpython/blob/master/Objects/listobject.c. See line 402. — 0Tech, Jan 31 '17 at 11:57
Yes, they're equivalent. The second (shorter) version should be very slightly faster. You can look at the source for [contains](https://github.com/python-git/python/blob/master/Objects/listobject.c#L430) and the list iterator's [next](https://github.com/python-git/python/blob/master/Objects/listobject.c#L2872). — wildwilhelm, Jan 31 '17 at 12:01

Stefan Pochmann · Accepted Answer · 2017-01-31T12:12:59.270

No, they're not equivalent. For example:

>>> mode = float('nan')
>>> allowed_modes = [mode]
>>> any(mode == allowed_mode for allowed_mode in allowed_modes)
False
>>> mode in allowed_modes
True

See Membership test operations for more details, including this statement:

For container types such as list, tuple, set, frozenset, dict, or collections.deque, the expression x in y is equivalent to any(x is e or x == e for e in y).
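
The quoted rule can be written out directly in Python. The following is a minimal sketch, not CPython's actual implementation; the helper names `contains` and `contains_eq_only` are hypothetical and chosen only for illustration. It compares the documented rule, the equality-only check from the question, and the built-in `in` operator on a NaN value.

```python
def contains(container, x):
    """Pure-Python model of `x in container`, following the quoted rule."""
    return any(x is e or x == e for e in container)


def contains_eq_only(container, x):
    """The check from the question: equality only, no identity test."""
    return any(x == e for e in container)


nan = float("nan")
same_object = [nan]                # the list holds the very same object
different_object = [float("nan")]  # a distinct NaN object

# Identity makes the membership test succeed even though nan != nan.
print(contains(same_object, nan), nan in same_object)            # True True
print(contains_eq_only(same_object, nan))                        # False
# With a distinct NaN object, neither identity nor equality holds.
print(contains(different_object, nan), nan in different_object)  # False False
```

For values that compare equal to themselves, such as the strings in the question, both checks give the same result.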

score 7 · Answer 2 · answered Jan 31 '17 at 11:57

Python lists are defined in C code.

You may verify it by looking at the code in the repository:

static int
list_contains(PyListObject *a, PyObject *el)
{
    Py_ssize_t i;
    int cmp;

    for (i = 0, cmp = 0 ; cmp == 0 && i < Py_SIZE(a); ++i)
        cmp = PyObject_RichCompareBool(el, PyList_GET_ITEM(a, i),
                                           Py_EQ);
    return cmp;
}

It's fairly straightforward to see that this code loops over the items in the list and stops when the first equality (Py_EQ) comparison between el and PyList_GET_ITEM(a, i) returns 1.
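
As a rough Python model of the C loop above, the sketch below assumes the documented behavior of PyObject_RichCompareBool, which returns 1 for Py_EQ when both arguments are the same object, before any `__eq__` method is consulted. The names `rich_compare_bool_eq`, `list_contains_model`, and `NeverEqual` are hypothetical and exist only for this illustration.

```python
def rich_compare_bool_eq(o1, o2):
    """Model of PyObject_RichCompareBool(o1, o2, Py_EQ)."""
    # Documented shortcut: identical objects count as equal
    # without calling __eq__ at all.
    if o1 is o2:
        return True
    return bool(o1 == o2)


def list_contains_model(a, el):
    """Model of list_contains: stop at the first positive comparison."""
    for item in a:
        if rich_compare_bool_eq(el, item):
            return True
    return False


class NeverEqual:
    """Hypothetical class whose instances never compare equal."""

    def __eq__(self, other):
        return False


obj = NeverEqual()
print(list_contains_model([obj], obj))  # True, via the identity shortcut
print(obj in [obj])                     # True, the built-in agrees
print(any(obj == e for e in [obj]))     # False, equality alone fails
```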

Łukasz Rogalski


I find this misleading, as you make it look like only "equality" is checked, so that the OP's two snippets would be equivalent. Which they're not, because *identity* is also checked, and that matters. See my answer and the note for [`PyObject_RichCompareBool`](https://docs.python.org/3/c-api/object.html#c.PyObject_RichCompareBool) saying "If o1 and o2 are the same object, PyObject_RichCompareBool() will always return 1 for Py_EQ and 0 for Py_NE". – Stefan Pochmann Jan 11 '20 at 16:32

score 5 · Answer 3 · answered Jan 31 '17 at 11:59

Not equivalent, since the any version requires an extra function call and a generator expression.

>>> mode = "access"
>>> allowed_modes = ["access", "read", "write"]
>>> 
>>> def f1():
...    mode in allowed_modes
... 
>>> def f2():
...    any(mode == x for x in allowed_modes)
... 
>>> 
>>> 
>>> import dis
>>> dis.dis(f1)
  2           0 LOAD_GLOBAL              0 (mode)
              3 LOAD_GLOBAL              1 (allowed_modes)
              6 COMPARE_OP               6 (in)
              9 POP_TOP
             10 LOAD_CONST               0 (None)
             13 RETURN_VALUE
>>> dis.dis(f2)
  2           0 LOAD_GLOBAL              0 (any)
              3 LOAD_CONST               1 (<code object <genexpr> at 0x7fb24a957540, file "<stdin>", line 2>)
              6 LOAD_CONST               2 ('f2.<locals>.<genexpr>')
              9 MAKE_FUNCTION            0
             12 LOAD_GLOBAL              1 (allowed_modes)
             15 GET_ITER
             16 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             19 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             22 POP_TOP
             23 LOAD_CONST               0 (None)
             26 RETURN_VALUE
>>>

This is more instructive than the Python source for the methods themselves, but here is the source of __contains__ for lists. The loop is in C, which will probably be faster than a Python loop.

Some timing numbers confirm this.

>>> import timeit
>>> timeit.timeit(f1)
0.18974408798385412
>>> timeit.timeit(f2)
0.7702703149989247
>>>
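
The measurement can also be reproduced as a standalone script. This is a minimal sketch: the function names `with_in` and `with_any` and the call count `n` are arbitrary choices for illustration, and absolute timings will vary by machine and Python version, so no expected numbers are given.

```python
import timeit

mode = "access"
allowed_modes = ["access", "read", "write"]


def with_in():
    # Membership test: the loop runs inside the C implementation of list.
    return mode in allowed_modes


def with_any():
    # Equality-only check: a generator expression driven by any().
    return any(mode == x for x in allowed_modes)


n = 100_000  # number of calls per measurement (arbitrary choice)
t_in = timeit.timeit(with_in, number=n)
t_any = timeit.timeit(with_any, number=n)

print(f"in:  {t_in:.4f} s for {n} calls")
print(f"any: {t_any:.4f} s for {n} calls")
```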

"Not equivalent" depends on how you define "equivalence". Functionally, both solutions _are_ equivalent in that they will yield the same result. — bruno desthuilliers, Jan 31 '17 at 12:01
True but my intention was to find differences in as many ways as possible and highlight them. After that, you can discard those which are not relevant and then consider the two equivalent or otherwise. To me, it's quite clear that you should use f1 rather than f2 in this situation. — Noufal Ibrahim, Jan 31 '17 at 12:04
I had well understood your intention ;) - just wanted to make clear that both solutions do yield the same result (since the OP didn't define "equivalent") and that both would do a sequential lookup with an equality test. Of course the containment test is the obvious pythonic solution (as well as the fastest). — bruno desthuilliers, Jan 31 '17 at 12:14
@brunodesthuilliers They don't always yield the same result. — Stefan Pochmann, Jan 31 '17 at 12:16
