I have a 2D array, and it has some duplicate columns. I would like to be able to see which unique columns there are, and where the duplicates are.
My own array is too large to put here, but here is an example:
import numpy as np
a = np.array([[1., 0., 0., 0., 0.], [2., 0., 4., 3., 0.]])
This has the unique column vectors [1.,2.], [0.,0.], [0.,4.] and [0.,3.]. There is one duplicate: [0.,0.] appears twice.
Now I found a way to get the unique vectors and their indices here, but it is not clear to me how I would get the occurrences of the duplicates as well. I have tried several naive approaches (with np.where and list comprehensions), but those are all very slow. Surely there has to be a numpythonic way?
In MATLAB it's just the unique function, but np.unique flattens arrays.
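To illustrate the flattening, here is a minimal sketch using the example array, assuming np.unique is called with its defaults (no extra arguments). The name flat_unique is only for illustration:

```python
import numpy as np

a = np.array([[1., 0., 0., 0., 0.],
              [2., 0., 4., 3., 0.]])

# Called with defaults, np.unique operates on the flattened array,
# so it returns the sorted unique scalar values, not the unique columns.
flat_unique = np.unique(a)
print(flat_unique)  # [0. 1. 2. 3. 4.]
```

The result is a 1D array of scalars, so the column structure I am after (which columns are unique, and where the duplicates sit) is lost.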