Numpy: need a hand in understanding what happens with the "in" operator

Question

I would appreciate it if somebody could help me with this and explain what's going on.

This works:

>>> from numpy import array
>>> a = array((2, 1))
>>> b = array((3, 3))
>>> l = [a, b]
>>> a in l
True

But this does not:

>>> c = array((2, 1))
>>> c in l
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

The behaviour I would like to replicate is:

>>> x = (2, 1)
>>> y = (3, 3)
>>> l2 = [x, y]
>>> z = (2, 1)
>>> z in l2
True

Note that the above also works with mutable objects:

>>> x = [2, 1]
>>> y = [3, 3]
>>> l2 = [x, y]
>>> z = [2, 1]
>>> z in l2
True

Of course, knowing that:

>>> (a < b).all()
True

I tried (and failed):

>>> (c in l).all()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

unutbu · Accepted Answer · 2011-11-04T22:30:50.960

6

Python makes the choice that bool([False, True]) is True because any non-empty list has boolean value True.

NumPy makes the choice that bool(np.array([False, True])) should raise a ValueError. NumPy was designed from the point of view that some users may want to know if any of the elements in the array are True, while others may want to know if all the elements in the array are True. Since users may have conflicting desires, NumPy refuses to guess. It raises a ValueError and suggests using a.any() or a.all() (to replicate Python's list behavior instead, you would test the length with len).

When you evaluate c in l, Python compares c with each element in l, starting with a. It evaluates bool(c == a). The comparison c == a is element-wise, so we get bool(np.array([True, True])), which raises a ValueError (for the reason described above).
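As a small illustration of that step (a sketch only, reusing the arrays from the question; the name `eq` is introduced here for convenience), the comparison can be pulled apart like this:

```python
import numpy as np

a = np.array((2, 1))
c = np.array((2, 1))

# == on arrays is element-wise: the result is an array of booleans, not a single bool
eq = c == a
print(eq.shape)  # (2,)

# Asking for the truth value of a multi-element array is what raises
try:
    bool(eq)
except ValueError as exc:
    print("ValueError:", exc)

# Being explicit about the reduction avoids the ambiguity
print(eq.all())
print(eq.any())
```

Both `eq.all()` and `eq.any()` are true here because every element matches; they would differ if only some elements were equal.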

Since numpy refuses to guess, you have to be specific. I suggest:

import numpy as np

a = np.array((2, 1))
b = np.array((3, 3))
c = np.array((2, 1))
l = [a, b]
# c is "in" l if all of its elements equal those of at least one array in l
print(any(np.all(c == elt) for elt in l))
# True
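If this test is needed in more than one place, it could be wrapped in a small helper. The following is a sketch, not part of the original answer: the function name `array_in_list` is made up for illustration, and it swaps in `np.array_equal`, which returns a single bool and also returns False when the shapes differ instead of trying to broadcast.

```python
import numpy as np

def array_in_list(arr, arrays):
    """Return True if any array in `arrays` has the same shape and values as `arr`."""
    # np.array_equal reduces the element-wise comparison to one plain bool
    return any(np.array_equal(arr, elt) for elt in arrays)

a = np.array((2, 1))
b = np.array((3, 3))
c = np.array((2, 1))
l = [a, b]

print(array_in_list(c, l))                    # same values as a
print(array_in_list(np.array((1, 2)), l))     # same shape, different values
print(array_in_list(np.array((2, 1, 0)), l))  # different shape
```

The first call reports a match; the other two do not.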

edited Nov 04 '11 at 22:30

answered Nov 03 '11 at 03:29

unutbu


Very helpful, thank you (+1)... I am still not getting why `a in l` gets a different result than `c in l` though. In both cases the command should test a numpy array in a standard list... what is the difference I can't see? – mac Nov 03 '11 at 13:39
@mac: That is a great question, and I don't know the answer. [The docs say](http://docs.python.org/reference/expressions.html#notin), "For the list and tuple types, x in y is true if and only if there exists an index i such that x == y[i] is true.", but `bool(a==l[0])` raises a ValueError and yet `a in l` returns True. I don't know how to resolve this (seeming) contradiction. – unutbu Nov 03 '11 at 14:16
5

`list.__contains__()` checks for identity first, then equality. So `a in [a,b]` returns True only because the first item is `a`. Anything else, even a different array object that happens to have all of the same values of `a` will evaluate the equality comparison, get an array, and then get an exception when it tries to cast that array to a bool. – Robert Kern Nov 04 '11 at 21:30
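The identity-then-equality behaviour described in this comment can be approximated in plain Python. This is only a rough emulation for illustration (the function name `contains` is hypothetical, and the real check is implemented in C inside CPython):

```python
import numpy as np

def contains(seq, x):
    """Rough emulation of `x in seq` for a list: identity first, then equality."""
    for item in seq:
        # `is` short-circuits, so == is never evaluated for the very same object
        if item is x or item == x:
            return True
    return False

a = np.array((2, 1))
b = np.array((3, 3))
c = np.array((2, 1))
l = [a, b]

print(contains(l, a))  # True: the first item is a itself, so no == is needed

for name, candidate in (("c", c), ("b", b)):
    try:
        contains(l, candidate)
    except ValueError as exc:
        # a == candidate yields an array, whose truth value is ambiguous
        print(name, "-> ValueError:", exc)
```

Under this emulation even `b`, which really is in the list, fails: `a` is compared first, it is not the same object as `b`, and the truth value of `a == b` is ambiguous.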
@RobertKern: [Oh, I see](http://hg.python.org/cpython/rev/d27f95e3b52f)... Thanks very much for the explanation. – unutbu Nov 04 '11 at 22:03
@RobertKern thank you. That clarifies. I also found [this](http://stackoverflow.com/questions/1322380/gotchas-where-numpy-differs-from-straight-python/6065203#6065203) to be interesting, although not the answer I was looking for. I'm selecting the answer as accepted now: you share the moral glory for that though! :) – mac Nov 05 '11 at 00:00
