Is there a row-wise equivalent of pandas nunique in numpy? I checked out np.unique with return_counts, but it doesn't seem to return what I want. For example:
import numpy as np
a = np.array([[120.52971, 75.02052, 128.12627], [119.82573, 73.86636, 125.792],
              [119.16805, 73.89428, 125.38216], [118.38071, 73.35443, 125.30198],
              [118.02871, 73.689514, 124.82088]])
# axis=0, which I took to mean row-wise
uniqueColumns, occurCount = np.unique(a, axis=0, return_counts=True)
The results:
>>> occurCount
array([1, 1, 1, 1, 1], dtype=int64)
I expected all 3s (each row holds three distinct values) rather than all 1s.
The workaround, of course, is to convert to pandas and call nunique, but that is slow, and I want to explore a pure numpy implementation to speed things up. I am working with large dataframes, so I am hoping to find speedups wherever I can. I am open to other solutions for speeding this up too.
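To make the target concrete, here is a minimal sketch of the kind of pure numpy routine being asked for. It is one possible approach, not a benchmarked one: sort each row, then count the positions where adjacent values differ. The helper name row_nunique is hypothetical, and the sketch assumes a 2-D array with at least one column and no NaNs (pandas nunique drops NaNs by default, while this version would count each NaN as its own value).

```python
import numpy as np

def row_nunique(arr):
    """Count distinct values in each row of a 2-D array (hypothetical helper)."""
    # After sorting, equal values sit next to each other within a row.
    s = np.sort(arr, axis=1)
    # Each change between neighbours marks a new distinct value; add 1 for the first.
    return (s[:, 1:] != s[:, :-1]).sum(axis=1) + 1

a = np.array([[120.52971, 75.02052, 128.12627], [119.82573, 73.86636, 125.792],
              [119.16805, 73.89428, 125.38216], [118.38071, 73.35443, 125.30198],
              [118.02871, 73.689514, 124.82088]])

print(row_nunique(a))  # [3 3 3 3 3]
```

For contrast, np.unique with axis=0 treats each whole row as a single item, so on this array it reports five distinct rows that each occur once.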