Consider the following list of two arrays:

from numpy import array

a = array([0, 1])
b = array([1, 0])

l = [a,b]

Then finding the index of a correctly gives

l.index(a)
>>> 0

while this does not work for b:

l.index(b)
ValueError: The truth value of an array with more than one element is ambiguous. 
Use a.any() or a.all()

It seems to me that calling a list's .index method does not work for lists of numpy arrays.

Does anybody know an explanation? Up to now, I have always worked around this problem in a rather clumsy way, by converting the arrays to strings. Does someone know a more elegant and faster solution?

flonk
  • For a list of identically shaped arrays, you can use a NumPythonic search: `np.nonzero((l == b).all(1))[0][0]`. – Divakar Nov 24 '15 at 09:06
  • This happens because `.index()` first compares identity and then equality. It worked for `a` because `a is l[0]` is `True`, but when it comes to finding `b` it ends up with `b == l[0]`, and now Python calls `bool()` on it: `bool(b == l[0])`. So, `l.index(array([0, 1]))` will also raise the same error. I have explained it here: http://stackoverflow.com/a/28815970/846892 – Ashwini Chaudhary Nov 24 '15 at 09:08
  • Hi, this is a duplicate of a question that has already been answered: http://stackoverflow.com/a/1156114/4061269 – Lior Dadon Nov 24 '15 at 09:09
  • @LiorDadon I think neither your suggested answer nor its underlying question addresses exactly the present problem. The OP there does not mention arrays, unlike here, and she even states that for a list of lists, `['a','b'] in list` *does* work. – flonk Nov 24 '15 at 09:14
  • @NotAnAmbiTurner I call `.index` *on* `l`, which is still a list. The point here is that not `l` but its *elements* are arrays. – flonk Nov 24 '15 at 09:21
  • More info on the arrays in the input list might help in finding an efficient solution, I think. For example, are they just `1D` arrays? Are they identically shaped? Are they just numeric arrays? – Divakar Nov 24 '15 at 09:44
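
The vectorized search suggested in the comments above can be sketched as follows. This is only an illustration and assumes that every array in the list is 1D and has the same length, so that the list can be stacked into one 2D array; the variable names `stacked`, `row_matches` and `first` are made up for this example.

```python
import numpy as np

a = np.array([0, 1])
b = np.array([1, 0])
l = [a, b]

# Assumption: all arrays share one shape, so they stack into a (2, 2) array.
stacked = np.stack(l)

# Compare every row with b, then require that a whole row matches.
row_matches = (stacked == b).all(axis=1)

# Positions of matching rows; take the first one if there is any.
matches = np.nonzero(row_matches)[0]
first = int(matches[0]) if matches.size else None
print(first)  # 1
```

If the arrays differ in shape, stacking fails and a plain loop over the list is probably the safer choice.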

2 Answers

The real question is in fact how l.index(a) can return a correct value at all, because numpy arrays treat equality in a special manner: l[1] == b compares individual values and returns an array, not a boolean. Here it gives array([ True, True], dtype=bool), which cannot be directly converted to a boolean, hence the error.

In fact, Python uses rich comparison, specifically PyObject_RichCompareBool, to compare the searched value to every element of the list in sequence. That means it first tests identity (a is b) and then equality (a == b). So for the first element, as a is l[0], identity is true and index 0 is returned.

But when searching for any other array, identity with the first element is false, and the equality test causes the error (thanks to Ashwini Chaudhary for the nice explanation in the comments).
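
The identity-then-equality behaviour described above can be imitated in pure Python. The function `index_like_list` below is a hypothetical helper written for illustration only; it is a rough sketch of the semantics, not CPython's actual implementation.

```python
import numpy as np

def index_like_list(seq, value):
    """Rough emulation of list.index (hypothetical helper, not CPython code)."""
    for i, item in enumerate(seq):
        # Identity is checked first; only if it fails is == evaluated,
        # and its result is then coerced to a plain bool.
        if item is value or bool(item == value):
            return i
    raise ValueError("value is not in list")

a = np.array([0, 1])
b = np.array([1, 0])
l = [a, b]

# a is l[0], so the identity check succeeds and == is never evaluated.
print(index_like_list(l, a))

# For b, the identity check against l[0] fails, l[0] == b yields an array,
# and bool() of that array raises the "ambiguous truth value" ValueError.
try:
    index_like_list(l, b)
except ValueError as exc:
    print(exc)
```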

You can confirm it by testing a new copy of an array containing same elements as l[0]:

d = array([0,1])
l.index(d)

It gives the same error, because identity is false and the equality test raises the error.

This means that you cannot rely on any list method that uses comparison (index, in, remove) and must use custom functions such as the one proposed by @orestiss. Alternatively, as a list of numpy arrays seems hard to use, you should consider wrapping the arrays:

>>> class NArray(object):
    def __init__(self, arr):
        self.arr = arr
    def array(self):
        return self.arr
    def __eq__(self, other):
        if (other.arr is self.arr):
            return True
        return (self.arr == other.arr).all()
    def __ne__(self, other):
        return not (self == other)


>>> a = array([0, 1])
>>> b = array([1, 0])
>>> l = [ NArray(a), NArray(b) ]
>>> l.index(NArray(a))
0
>>> l.index(NArray(b))
1
Serge Ballesta
  • There's no special case for the first element; [rich comparison](https://docs.python.org/3/c-api/object.html#c.PyObject_RichCompareBool) always compares identity first. – Ashwini Chaudhary Nov 24 '15 at 09:53
  • `PyObject_RichCompareBool` is called for each and every item; it compares identity first and, if they are different objects, calls `PyObject_RichCompare` on them. `l.index(b)` basically fails because after comparing the identity of `a` (i.e. `l[0]`) and `b`, Python tries `==` on them and then calls `bool()` on the result, which fails. So it never even reaches the second index; otherwise it would have returned 1. – Ashwini Chaudhary Nov 24 '15 at 10:23
  • @AshwiniChaudhary: Thank you for this nice explanation. I've edited my post accordingly. – Serge Ballesta Nov 24 '15 at 11:11
This error comes from the way numpy compares arrays element by element (see: link).

So I am guessing that since the first element is the very object you are searching for, you get its index, but when b is compared with the first element you get this error.

I think you could use something like:

[i for i, temp in enumerate(l) if (temp == b).all()]

to get a list with the indices of the equal arrays. Since I am no expert in Python there could be a better solution, but it seems to work.
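
If only the first match is needed, the same idea can be wrapped in a small function that mimics list.index. This is a minimal sketch: the name `index_of_array` is hypothetical, and it swaps in `np.array_equal` instead of `(temp == b).all()` so that arrays of different shapes simply compare as unequal.

```python
import numpy as np

def index_of_array(arrays, target):
    """Return the position of the first array equal to target (hypothetical helper)."""
    for i, candidate in enumerate(arrays):
        # np.array_equal returns a single bool and is False for mismatched shapes.
        if np.array_equal(candidate, target):
            return i
    # Mirror list.index, which raises ValueError when nothing matches.
    raise ValueError("array is not in list")

l = [np.array([0, 1]), np.array([1, 0])]
print(index_of_array(l, np.array([1, 0])))  # 1
```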

orestiss