Indexing or removing a NumPy array item in a Python list does not fail as expected when the item is the first one.

import numpy as np

# works:
lst = [np.array([1,2]), np.array([3,4])]
lst.index(lst[0])

# fails with: ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
lst = [np.array([1,2]), np.array([3,4])]
lst.index(lst[1])

I understand why the second one fails; I would like to understand why the first one works.

o17t H1H' S'k
  • It seems that somewhere inside the built-in `lst.index(x)` there is an `x == lst[i]`. That is a comparison between two arrays, which returns an array of bools, while the `index` function is expecting a single bool value. – ArrowRise Aug 01 '22 at 08:42
  • Of course, so why does the first one pass? – o17t H1H' S'k Aug 01 '22 at 08:47
  • Yeah, the first one shouldn't give you the right result. The answer is in the source of this `index` method. – ArrowRise Aug 01 '22 at 08:54
  • It's an optimization: the first check is [reference equality](https://github.com/python/cpython/blob/main/Objects/object.c#L739) (roughly similar to `id(lst[0]) == id(lst[0])`); if that fails, an elementwise check is performed. That fails for `lst[0] == lst[1]` if those are `numpy` arrays. – Michael Szczesny Aug 01 '22 at 08:56
  • @michael wanna put that in an answer? – o17t H1H' S'k Aug 01 '22 at 09:04
  • @o17tH1H'S'k - I'd rather find the dupe. It's hard to search, but this has been answered before. – Michael Szczesny Aug 01 '22 at 09:06
  • Sorry, but `Guarantees that identity implies equality` looks like buggy logic... `l = [inf, nan]; i = float("inf"); n = float("nan"); print(l.index(i), 'Works'); print(l.index(nan), 'Works'); print(l.index(n), 'Fails')` – Askold Ilvento Aug 01 '22 at 09:17
  • @AskoldIlvento - That looks like an interesting edge-case, but it's not reproducible as *inf*, *nan* are not defined. `nan` is a special case, it has no equality or identity. – Michael Szczesny Aug 01 '22 at 09:21
  • @MichaelSzczesny, I meant `from numpy import nan, inf` – Askold Ilvento Aug 01 '22 at 09:42
  • Funny point: a similar problem can be reproduced without NumPy, in pure Python: `import math; lst = [4.2, math.nan]; lst.index(math.nan); lst.index(math.nan+1)`. The former works while the latter does not. This is because there is not just one IEEE 754 NaN but many. Meanwhile `math.nan == math.nan` is `False`, `math.nan == math.nan+1` is `False` (the same thing for `!=`) and `id(np.nan) == id(np.nan+1)` is also `False`. This is certainly a *huge source of bugs* in numerical code (especially since NumPy gives different results and conversions are quite frequent). – Jérôme Richard Aug 01 '22 at 11:16
  • @MichaelSzczesny. That does not come even close to addressing OP's concern – Mad Physicist Aug 05 '22 at 19:28
  • @MichaelSzczesny. OP is aware of why the second option fails. The way I understand it, the question is about why the first one works – Mad Physicist Aug 05 '22 at 19:33
  • @MichaelSzczesny. That being said, the answer to the second dupe does explain what is happening – Mad Physicist Aug 05 '22 at 19:34

1 Answer

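Because `list.index` checks identity (`is`) before it falls back to `==`. `lst[0]` is the very same object as the first element of the list, so the identity check succeeds and the array's `__eq__` is never called. When searching for `lst[1]`, the identity check against the first element fails, so `lst[0] == lst[1]` is evaluated; that yields an array, and its truth value is ambiguous:

lst = [np.array([1,2]), np.array([3,4])]
lst[0]                                  => array([1, 2])
lst[1]                                  => array([3, 4])
lst[0] == lst[1]                        => array([False, False])
id(lst[0])                              => 140232956554776
id(lst[1])                              => 140232956554960
lst[0] is lst[0]                        => True
lst.index(lst[0])                       => 0
lst.index(lst[1])                       => ValueError: The truth value of an array with more than one element is ambiguous.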
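The identity-then-equality order can be modelled in pure Python. This is only a sketch of the behaviour, not CPython's actual implementation; `index_like_list` is a hypothetical helper name introduced here for illustration.

```python
import numpy as np

def index_like_list(seq, target):
    """Hypothetical model of list.index: identity check first, then ==."""
    for i, item in enumerate(seq):
        # `item is target` short-circuits, so == is skipped on an identity hit.
        # bool(...) is where an array-valued comparison raises ValueError.
        if item is target or bool(item == target):
            return i
    raise ValueError("target is not in list")

lst = [np.array([1, 2]), np.array([3, 4])]

# Identity hit on the first element: the arrays are never compared with ==.
first = index_like_list(lst, lst[0])

# Identity miss on the first element: lst[0] == lst[1] is evaluated and
# its array result cannot be reduced to a single bool.
try:
    index_like_list(lst, lst[1])
    second_error = None
except ValueError as exc:
    second_error = str(exc)
```

The same shortcut explains why a NaN object can be found in a list that contains that very object even though it does not compare equal to itself.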

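To search by value instead, the comparison has to return a single bool, so convert every element as well as the search target, for example to lists:

[a.tolist() for a in lst].index(lst[0].tolist()) => 0
[a.tolist() for a in lst].index(lst[1].tolist()) => 1

Or to bytes (`tostring()` is the deprecated name for `tobytes()`; note that the raw bytes ignore shape and dtype):

[a.tobytes() for a in lst].index(lst[0].tobytes()) => 0
[a.tobytes() for a in lst].index(lst[1].tobytes()) => 1

But this is not going to be an efficient way to work with the data, since you'll be converting the arrays to lists or bytes on every lookup.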
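If the conversion cost matters, a plain loop avoids it. The following is a minimal sketch under the assumption that you either want the position of one specific array object, or the position of the first array with equal contents; the helper names `index_by_identity` and `index_by_value` are made up for this example.

```python
import numpy as np

def index_by_identity(seq, target):
    """Position of the exact object `target` in `seq` (compares with `is`)."""
    for i, item in enumerate(seq):
        if item is target:
            return i
    raise ValueError("object is not in list")

def index_by_value(seq, target):
    """Position of the first array equal to `target` in shape and contents."""
    for i, item in enumerate(seq):
        # np.array_equal returns a single bool, never an array
        if np.array_equal(item, target):
            return i
    raise ValueError("no equal array in list")

lst = [np.array([1, 2]), np.array([3, 4])]

pos_identity = index_by_identity(lst, lst[1])      # same object
pos_value = index_by_value(lst, np.array([3, 4]))  # new object, equal contents
```

Removal works the same way, for example `del lst[index_by_identity(lst, target)]`, which sidesteps the `==` call that `lst.remove` would make.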

Constantin