0

I have a numpy array:

  arr = array([[991.4, 267.3, 192.3],
               [991.4, 267.4, 192.3],
               [991.4, 267.4, 192.3],
               ...,
               [993.5, 268. , 192.6],
               [993.5, 268. , 192.6],
               [993.5, 268.1, 192.6]])

As you can see, it contains some duplicate rows.

I have tried arr = np.unique(arr) but that returns:

array([192.3, 192.4, 192.5, 192.6, 266.6, 266.7, 266.8, 266.9, 267. ,
       267.1, 267.2, 267.3, 267.4, 267.5, 267.6, 267.7, 267.8, 267.9,
       268. , 268.1, 268.2, 268.3, 268.4, 268.5, 268.6, 268.7, 268.8,
       991.4, 991.5, 991.6, 991.7, 991.8, 991.9, 992. , 992.1, 992.2,
       992.3, 992.4, 992.5, 992.6, 992.7, 992.8, 992.9, 993. , 993.1,
       993.2, 993.3, 993.4, 993.5])

I need to keep the nested (2D) structure of the array: each row should be compared as a whole with the other rows, and only duplicate rows should be removed. For example:

[991.4, 267.3, 192.3],
[991.4, 267.4, 192.3],
[991.4, 267.4, 192.3],

The rows above contain 2 unique rows, so after filtering the result should be:

[991.4, 267.3, 192.3],
[991.4, 267.4, 192.3],
Alireza75
Spatial Digger
  • Does this answer your question? [Remove duplicate rows of a numpy array](https://stackoverflow.com/questions/31097247/remove-duplicate-rows-of-a-numpy-array) – jhso Dec 20 '22 at 01:15
  • You can convert your array to a pandas DataFrame or Series (a Series of lists), then use the pandas unique method, which supports lists, and convert the result back to a NumPy array. Note: a Series is faster than a DataFrame. – Alireza75 Dec 20 '22 at 05:21
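
For anyone who wants to try the pandas route from the comment above, here is a minimal sketch. Note that it uses DataFrame.drop_duplicates rather than the unique method the comment mentions; that substitution, the sample values, and the name deduped are my own assumptions, not something the commenter posted.

```python
import numpy as np
import pandas as pd

# Small stand-in for the array in the question (values are illustrative).
arr = np.array([[991.4, 267.3, 192.3],
                [991.4, 267.4, 192.3],
                [991.4, 267.4, 192.3],
                [993.5, 268.0, 192.6],
                [993.5, 268.0, 192.6]])

# Each array row becomes a DataFrame row; drop_duplicates keeps the
# first occurrence of every distinct row, in the original order.
deduped = pd.DataFrame(arr).drop_duplicates().to_numpy()

print(deduped.shape)
```

Unlike np.unique, this keeps the rows in their original order instead of sorting them.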

2 Answers

0
new_data = np.unique(arr, axis=0)

I believe this should help as we only need to remove duplicate rows.

Passing the additional parameter axis=0 tells np.unique to compare whole rows; axis=1 would compare whole columns instead.
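
To make the difference concrete, here is a small self-contained sketch. The sample values are a shortened stand-in for the array in the question, and the names flat_unique and row_unique are just for illustration.

```python
import numpy as np

# Shortened stand-in for the array in the question.
arr = np.array([[991.4, 267.3, 192.3],
                [991.4, 267.4, 192.3],
                [991.4, 267.4, 192.3],
                [993.5, 268.0, 192.6],
                [993.5, 268.0, 192.6],
                [993.5, 268.1, 192.6]])

# Without axis, the array is flattened and unique scalars are returned.
flat_unique = np.unique(arr)

# With axis=0, each row is treated as one item, so the 2D shape is kept.
row_unique = np.unique(arr, axis=0)

print(flat_unique.shape)
print(row_unique.shape)
```

Keep in mind that np.unique returns the unique rows in sorted order, which may differ from their order in the input.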

-1

To remove duplicate rows from a NumPy array, you can use np.unique with the axis parameter and, optionally, the return_index parameter. The axis parameter specifies the axis along which unique entries are found (axis=0 for rows), and return_index specifies whether to also return the index of the first occurrence of each unique row in the original array.
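
The answer's own code is not included here, so the following is only a sketch of how those two parameters might be combined. The sample values and the names first_idx and deduped are assumptions. It uses the returned indices to keep the unique rows in their original order rather than the sorted order np.unique produces.

```python
import numpy as np

# Illustrative data with duplicates in a non-sorted order.
arr = np.array([[993.5, 268.1, 192.6],
                [991.4, 267.4, 192.3],
                [993.5, 268.1, 192.6],
                [991.4, 267.3, 192.3],
                [991.4, 267.4, 192.3]])

# return_index=True also returns, for each unique row, the index of its
# first occurrence in arr.
_, first_idx = np.unique(arr, axis=0, return_index=True)

# Sorting those indices restores the original row order.
deduped = arr[np.sort(first_idx)]

print(deduped)
```

If the order of the rows does not matter, np.unique(arr, axis=0) on its own is enough.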