I have a 3D numpy array like this:
>>> a
array([[[0, 1, 2],
        [0, 1, 2],
        [6, 7, 8]],

       [[6, 7, 8],
        [0, 1, 2],
        [6, 7, 8]],

       [[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]]])
I want to remove only those 2D blocks (entries along the first axis) that contain duplicate rows within themselves. For instance, the output should look like this:
>>> remove_row_duplicates(a)
array([[[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]]])
This is the function that I am using:
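import numpy as np

def remove_row_duplicates(a):
    # Indices of the 2D blocks that contain at least one repeated row
    delindices = np.empty(0, dtype=int)
    for i in range(len(a)):
        # Round first so that nearly equal floats compare as equal
        _, indices = np.unique(np.around(a[i], decimals=10), axis=0, return_index=True)
        if len(indices) < len(a[i]):
            delindices = np.append(delindices, i)
    return np.delete(a, delindices, 0)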
This works perfectly, but my real array now has shape (1000000, 7, 3). The Python for loop is slow, so this takes a lot of time. My original array also contains floating-point numbers. Does anyone have a better solution, or can anyone help me vectorize this function?
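One direction that might work is to replace the per-block `np.unique` call with a pairwise row comparison over the whole array at once. The following is only a minimal sketch under some assumptions: the name `remove_row_duplicates_vectorized` and the `decimals` parameter are made up for illustration, the input is assumed to be 3D with a small second dimension (7 here, so 21 row pairs per block), and NaN handling is not considered.

```python
import numpy as np

def remove_row_duplicates_vectorized(a, decimals=10):
    # Hypothetical helper: same rounding as the loop version, applied to the whole array
    r = np.around(a, decimals=decimals)
    m = r.shape[1]
    # All index pairs (i, j) with i < j, i.e. every distinct pair of rows in a block
    iu, ju = np.triu_indices(m, k=1)
    # pair_equal[n, p] is True when rows iu[p] and ju[p] of block n match in every column
    pair_equal = (r[:, iu] == r[:, ju]).all(axis=-1)
    # A block is dropped if any pair of its rows is equal
    has_dup = pair_equal.any(axis=1)
    return a[~has_dup]
```

On the example array above this keeps only the last block. The intermediate boolean array has shape (N, m*(m-1)/2, 3), so for very large N it may be worth processing the first axis in chunks to limit memory use.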