numpy ValueError when trying to sort list of np.ndarray with respect to list of ints

Question

I am trying to sort a list of numpy arrays with respect to a list of integers in ascending order, the problem discussed in this post. Specifically, I am using the top rated solution from the post.

This first example produces the intended solution:

>>> x1 = [np.array([1,2,3]),np.array([4,5,6]),np.array([7,8,9])]
>>> y1 = [6, 10 , 4]
>>> y1_sorted, x1_sorted = zip(*sorted(zip(y1, x1)))
>>> y1_sorted, x1_sorted
((4, 6, 10), (array([7, 8, 9]), array([1, 2, 3]), array([4, 5, 6])))

However, this second example, with variables seemingly of the same type, produces this error:

>>> x2 = [np.array([1, 2, 3]),
...                   np.array([1, 3, 2]),
...                   np.array([2, 1, 3]),
...                   np.array([2, 3, 1]),
...                   np.array([3, 1, 2]),
...                   np.array([3, 2, 1])]
>>> y2 = [6,3,7,1,3,8]
>>> y2_sorted, x2_sorted = zip(*sorted(zip(y2, x2)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Would anyone be able to explain what is happening? I am using numpy 1.20.3 with Python 3.8.12.

The answer you are working from uses lists, not arrays. The error means you, or in this case `sorted` is trying to compare arrays, and not getting the simple True/False value it needs. — hpaulj, Dec 29 '21 at 18:06

score 1 · Answer 1 · answered Dec 29 '21 at 18:13

So the line sorted(zip(y1, x1)) in the first part of the code seems to be sorting according to y1.

What you can do is use the key argument of sorted to replicate that behaviour:

y2_sorted, x2_sorted = zip(*sorted(zip(y2, x2), key=lambda pair: pair[0]))
print(y2_sorted)
# (1, 3, 3, 6, 7, 8)
print(x2_sorted)
# (array([2, 3, 1]), array([1, 3, 2]), array([3, 1, 2]), array([1, 2, 3]), array([2, 1, 3]), array([3, 2, 1]))

score 1 · Accepted Answer · answered Dec 29 '21 at 20:10

By default, sorted compares tuples by their first elements; if those tie, it compares the second elements, then the third, and so on.

In y2, 3 appears twice, so sorted has to compare the second elements of those two tuples to break the tie. The second elements are arrays, and comparing arrays gives an elementwise result rather than a single True/False, so you get the error. In other words, it's as if you ran the following:

y2_sorted, x2_sorted = zip(*sorted(zip(y2, x2), key=lambda x: (x[0], x[1])))
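
To see the failure in isolation, here is a minimal sketch that compares two tuples directly, which is roughly what sorted does internally. The names a, b, elementwise, error_message and no_tie are just for illustration:

```python
import numpy as np

a = np.array([1, 3, 2])
b = np.array([3, 1, 2])

# Comparing arrays is elementwise, so the result is an array, not a bool.
elementwise = a < b
print(elementwise.tolist())  # [True, False, False]

# Tuples are compared item by item. When the first items are equal,
# Python moves on to the second items and needs a single True/False.
try:
    (3, a) < (3, b)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# With distinct first items the arrays are never compared, which is
# why the first example in the question works.
no_tie = (3, a) < (4, b)
print(no_tie)  # True
```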

One way to keep using sorted here is to sort only by the first element, as @niko suggested:

y2_sorted, x2_sorted = zip(*sorted(zip(y2, x2), key=lambda x: x[0]))

In this case, you sort only by the first elements (i.e. by y2). Because sorted is stable, ties in y2 keep the order in which they appear in the input.
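
As a quick check of the tie behaviour, here is a sketch on the data from the question. It uses operator.itemgetter(0) instead of the lambda, which is just an equivalent spelling; the pairs variable is only for illustration:

```python
from operator import itemgetter

import numpy as np

x2 = [np.array(v) for v in ([1, 2, 3], [1, 3, 2], [2, 1, 3],
                            [2, 3, 1], [3, 1, 2], [3, 2, 1])]
y2 = [6, 3, 7, 1, 3, 8]

# Sort on the integer only; the arrays are carried along untouched.
pairs = sorted(zip(y2, x2), key=itemgetter(0))
y2_sorted, x2_sorted = zip(*pairs)

print(y2_sorted)  # (1, 3, 3, 6, 7, 8)
print([a.tolist() for a in x2_sorted])
# [[2, 3, 1], [1, 3, 2], [3, 1, 2], [1, 2, 3], [2, 1, 3], [3, 2, 1]]
```

The two entries with y2 == 3 come out as [1, 3, 2] and then [3, 1, 2], the same order they have in x2.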

Another way is to state explicitly how the np.arrays should break ties. For example, to sort by the first element of each array when there are ties in y2:

y2_sorted, x2_sorted = zip(*sorted(zip(y2, x2), key=lambda x: (x[0], x[1][0])))
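
If the first element is not enough to separate tied arrays, one possible variation (not part of the original suggestion) is to break ties on the whole array by converting it to a list, since lists compare element by element. The data below is hypothetical and chosen so that the two approaches differ:

```python
import numpy as np

# Hypothetical data: the two arrays tied at y == 3 share the same first element.
y = [3, 3, 1]
x = [np.array([1, 3, 2]), np.array([1, 2, 3]), np.array([2, 1, 3])]

# Tie-break on the first array element only: the tie remains, so the
# input order of the two arrays is kept.
by_first = sorted(zip(y, x), key=lambda p: (p[0], p[1][0]))

# Tie-break on the full contents: lists are compared lexicographically.
by_whole = sorted(zip(y, x), key=lambda p: (p[0], p[1].tolist()))

print([a.tolist() for _, a in by_first])  # [[2, 1, 3], [1, 3, 2], [1, 2, 3]]
print([a.tolist() for _, a in by_whole])  # [[2, 1, 3], [1, 2, 3], [1, 3, 2]]
```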

Lastly, since x2 is a list of equal-length np.arrays, you can use numpy.argsort instead (the result is a 2-D array rather than a tuple of arrays):

x2_sorted = np.array(x2)[np.argsort(y2)]
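
A slightly fuller sketch of the argsort route, assuming all arrays in x2 have the same length so they stack into one 2-D array. Passing kind="stable" is my addition: the default algorithm does not promise to keep tied entries in input order, whereas a stable sort matches what sorted does with key=lambda x: x[0]. The order variable is only for illustration:

```python
import numpy as np

x2 = np.array([[1, 2, 3], [1, 3, 2], [2, 1, 3],
               [2, 3, 1], [3, 1, 2], [3, 2, 1]])
y2 = np.array([6, 3, 7, 1, 3, 8])

# Indices that would sort y2; a stable sort keeps ties in input order.
order = np.argsort(y2, kind="stable")

# Reuse the same index array for both, so the rows stay paired with y2.
y2_sorted = y2[order]
x2_sorted = x2[order]

print(order.tolist())      # [3, 1, 4, 0, 2, 5]
print(y2_sorted.tolist())  # [1, 3, 3, 6, 7, 8]
print(x2_sorted.tolist())
# [[2, 3, 1], [1, 3, 2], [3, 1, 2], [1, 2, 3], [2, 1, 3], [3, 2, 1]]
```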
