How can I get the unique arrays from a list like the one below?

import numpy as np

data = [np.array([10, 17]),
        np.array([10, 17]),
        np.array([1, 17, 34]),
        np.array([1, 17, 34]),
        np.array([20, 50, 38]),
        np.array([20, 50, 38]),
        np.array([20, 50, 40])]

expected = [ np.array([ 10, 17]),
             np.array([ 1, 17, 34]),
             np.array([ 20, 50, 38]),
             np.array([ 20, 50, 40])]

I tried set(data), but it raised this error:

TypeError: unhashable type: 'numpy.ndarray'

I have a few ideas:

  • Looping over the list and appending each array to a new list only if it does not compare equal (using '==') to any array already in that list.

  • Applying a mathematical operation to each array that yields a unique value per distinct array, then removing the duplicates.

  • Calling tobytes on each array to get a hashable representation, then removing the duplicates.

All of these sound inefficient to me. Is there an easier way to solve this?

datatech

1 Answer

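NumPy arrays are unhashable, but tuples are hashable, so you can convert each array to a tuple, put the tuples in a set to drop the duplicates, then convert the result back to NumPy arrays. Note that a set does not preserve the original order.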

out = list(map(np.array, set(map(tuple, data))))
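A self-contained sketch of the line above, assuming every array is one-dimensional (the sample data is copied from the question). Because a set has no defined order, the tuples are sorted here purely to make the printed result reproducible; that sorting step is an addition for illustration, not part of the one-liner:

```python
import numpy as np

data = [np.array([10, 17]),
        np.array([10, 17]),
        np.array([1, 17, 34]),
        np.array([1, 17, 34]),
        np.array([20, 50, 38]),
        np.array([20, 50, 38]),
        np.array([20, 50, 40])]

# tuple(arr) turns a 1-D array into a hashable tuple of NumPy scalars,
# so equal arrays collapse into a single set element.
unique_tuples = set(map(tuple, data))

# Sets are unordered; sort the tuples so the result is deterministic.
out = [np.array(t) for t in sorted(unique_tuples)]
print(out)
```

For arrays with more than one dimension, tuple(arr) would produce a tuple of row arrays, which is still unhashable, so this approach would need a different key.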

If the order in which the arrays appear in the list matters, you could use dict.fromkeys instead, since a dict preserves insertion order (Python 3.7+):

out = list(map(np.array, dict.fromkeys(map(tuple, data)).keys()))

Output:

[array([10, 17]),
 array([ 1, 17, 34]),
 array([20, 50, 38]),
 array([20, 50, 40])]
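The tuple round trip rebuilds every array from scratch. As a possible alternative, here is a minimal sketch of the tobytes idea mentioned in the question, which keeps the original array objects and preserves input order. The helper name unique_arrays and the choice of key (shape, dtype string, raw bytes) are assumptions made for illustration, not something from the original answer:

```python
import numpy as np

def unique_arrays(arrays):
    """Return the first occurrence of each distinct array, in input order."""
    seen = {}
    for arr in arrays:
        arr = np.asarray(arr)
        # Assumed key: shape and dtype are included because raw bytes alone
        # cannot distinguish arrays of different shapes or element types.
        key = (arr.shape, arr.dtype.str, arr.tobytes())
        # setdefault keeps the first array stored under each key.
        seen.setdefault(key, arr)
    return list(seen.values())

data = [np.array([10, 17]),
        np.array([10, 17]),
        np.array([1, 17, 34]),
        np.array([1, 17, 34]),
        np.array([20, 50, 38]),
        np.array([20, 50, 38]),
        np.array([20, 50, 40])]

out = unique_arrays(data)
print(out)
```

Since the comparison is bytewise, arrays that hold the same values with different dtypes (for example an integer 1 and a float 1.0) are treated as distinct under this key.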