I want to iterate over an array of arrays and skip any array I have already seen. The following code works, but I'm looking for a more 'Pythonic' solution.

from sklearn import datasets
import numpy as np
iris = datasets.load_iris()
X = iris.data[:, :2]

read = []
for x in X:
    temp = True
    for r in read:
        if np.array_equal(x, r):
            temp = False
    if temp:
        read.append(x)
        # do some stuff

Type and content of X:

>>> type(X)
<class 'numpy.ndarray'>

>>> X
array([[5.1, 3.5],
       [4.9, 3. ],
       [4.9, 3. ],
       [4.7, 3.2],
       [4.6, 3.1],
       [5. , 3.6],
       ...
       [5.9, 3. ]])

For example, when I read [4.9, 3. ] the first time I do some stuff. When I read [4.9, 3. ] again I skip to the next array.

user3420714

1 Answer

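You can use numpy.unique with axis=0 to drop duplicate rows. Because numpy.unique returns the rows in sorted order, you can preserve the original order by requesting the first-occurrence indices (return_index=True), sorting them, and indexing your array with the sorted indices. Then just iterate over the result.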

Here's a minimal example:

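import numpy as np

A = np.array([[5.1, 3.5],
              [4.9, 3. ],
              [4.9, 3. ],
              [4.7, 3.2]])

# idx holds the index of the first occurrence of each unique row
_, idx = np.unique(A, axis=0, return_index=True)

# sorting idx restores the original row order
print(A[np.sort(idx)])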

[[5.1 3.5]
 [4.9 3. ]
 [4.7 3.2]]
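
To tie this back to the loop in the question, here is a rough sketch of how the iteration could look. The helper name unique_rows_in_order and the processed list are my own placeholders, not part of NumPy, and the append stands in for whatever "do some stuff" is in your code. Note that this compares rows by exact float equality, just as np.array_equal does in the original loop.

```python
import numpy as np

def unique_rows_in_order(a):
    """Return the distinct rows of a 2-D array, in order of first appearance.

    Hypothetical helper for illustration; not a NumPy function.
    """
    # return_index=True yields the index of the first occurrence of each unique row
    _, idx = np.unique(a, axis=0, return_index=True)
    # np.unique sorts the rows, so sort the indices to restore the original order
    return a[np.sort(idx)]

X = np.array([[5.1, 3.5],
              [4.9, 3.0],
              [4.9, 3.0],
              [4.7, 3.2],
              [5.1, 3.5]])

processed = []
for row in unique_rows_in_order(X):
    # placeholder for "do some stuff": each distinct row is visited once
    processed.append(row.tolist())
```

Each distinct row is handled once, in the order it first appears, without keeping a manual list of rows already read.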
jpp