I am looking to count the number of unique rows in a 3D NumPy array. Take the following array:
a = np.array([[[1, 2], [1, 2], [2, 3]], [[2, 3], [2, 3], [3, 4]], [[1, 2], [1, 2], [1, 2]]])
My desired output is a 1-D array of the same length as axis 0 of the 3-D array: array([2, 2, 1]).
In this example, the output would be 2, 2, 1 because in the first grouping [1, 2] and [2, 3] are the unique rows, in the second grouping [2, 3] and [3, 4] are the unique rows, and in the third grouping [1, 2] is the only unique row. Perhaps I'm using "unique" incorrectly in this context, but that is what I'm looking to calculate.
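To pin down the intended result, here is a minimal sketch of the straightforward looped computation. The function name count_unique_rows_loop is made up for illustration; it calls np.unique with axis=0 on each 2-D group and counts the rows that come back.

```python
import numpy as np

def count_unique_rows_loop(a):
    """Reference version (hypothetical helper): one np.unique call per 2-D group."""
    a = np.asarray(a)
    # np.unique(group, axis=0) returns the distinct rows of one group;
    # its length is the number of unique rows in that group.
    return np.array([len(np.unique(group, axis=0)) for group in a])

a = np.array([[[1, 2], [1, 2], [2, 3]],
              [[2, 3], [2, 3], [3, 4]],
              [[1, 2], [1, 2], [1, 2]]])

counts = count_unique_rows_loop(a)
print(counts)
```

This is only meant as a correctness baseline; it makes one Python-level call per group, so it will be slow when axis 0 is very long.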
The difficulty I'm having is that the count of unique rows differs from one grouping to the next. If I use np.unique, the result is broadcast as shown below:
>>> np.unique(a, axis=1)
array([[[1, 2],
        [2, 3]],
       [[2, 3],
        [3, 4]],
       [[1, 2],
        [1, 2]]])
I know I can loop over each 2D array and use np.apply_along_axis, as described in this answer.
However, I am dealing with arrays as large as (1 000 000, 256, 2), so I would prefer to avoid loops if this is possible.
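One possible loop-free direction, offered as a sketch rather than a definitive answer: sort the rows within each group so that identical rows become adjacent, then count the positions where a row differs from its predecessor. The function name count_unique_rows is hypothetical, and the code assumes a 3-D input of shape (groups, rows, columns) with at least one row per group.

```python
import numpy as np

def count_unique_rows(a):
    """Count distinct rows in each 2-D group of a 3-D array (hypothetical helper)."""
    a = np.asarray(a)
    if a.ndim != 3:
        raise ValueError("expected a 3-D array of shape (groups, rows, columns)")
    n, m, k = a.shape
    if m == 0:
        return np.zeros(n, dtype=np.intp)

    s = a
    # Stable-sort by each column, last column first. After the final pass the
    # rows of every group are in lexicographic order, so equal rows are adjacent.
    for col in range(k - 1, -1, -1):
        order = np.argsort(s[:, :, col], axis=1, kind="stable")
        s = np.take_along_axis(s, order[:, :, None], axis=1)

    # A row starts a new run if it differs from the previous row in any column.
    changed = np.any(s[:, 1:, :] != s[:, :-1, :], axis=2)
    # Number of runs = number of changes + 1 (for the first row).
    return changed.sum(axis=1) + 1

a = np.array([[[1, 2], [1, 2], [2, 3]],
              [[2, 3], [2, 3], [3, 4]],
              [[1, 2], [1, 2], [1, 2]]])

result = count_unique_rows(a)
print(result)
```

The only Python-level loop here runs over the columns (two iterations for the shapes in question), not over the groups. The sort makes temporary copies of the data, so for an array of shape (1 000 000, 256, 2) it may be necessary to process axis 0 in chunks to keep memory use manageable.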