I have two lists from which I need to find the indices associated with unique pairs (all the SO posts I could find are only interested in the pairs themselves). I've been trying to use numpy.unique to do so, but am hitting an oddity. I zipped the lists to create a list of tuples, which set() and np.unique() then successfully pare down to only the unique pairs, but what I want is the indices into the original list. The documentation for unique indicates that it will return those if return_inverse=True. However, I am getting different levels of "flattening" depending on whether that is set.
In this example I use strings just to avoid any comparison issues; in reality the values are floats.
import numpy as np
l_1 = ['12.34', '12.34', '12.34', '12.34', '56.78', '56.78', '90.12', '90.12']
l_2 = ['-1.23', '-1.23', '-4.56', '-4.56', '-6.78', '-6.78', '-9.01', '-9.01']
ll = list(zip(l_1, l_2))  # list() keeps the pairs reusable; on Python 3, zip() alone is a one-shot iterator
ull1 = np.unique(ll)
ull2, inds = np.unique(ll, return_inverse=True)
In the first case the pairs are preserved as a second dimension in the output. In the second case even the tuples are flattened out, thus destroying the pairs.
In [1]: ull1
Out[1]:
array([['-9.01', '90.12'],
       ['-1.23', '12.34'],
       ['-6.78', '56.78'],
       ['-4.56', '12.34']],
      dtype='|S5')
In [2]: ull2
Out[2]:
array(['-1.23', '-4.56', '-6.78', '-9.01', '12.34', '56.78', '90.12'],
      dtype='|S5')
Is this done on purpose? Is there some way to make unique give me the indices that I want in the first case (which would be something like [[6,7], [0,1], [4,5], [2,3]])? I can't tell from the documentation whether the former or the latter behavior is the odd one out.
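For comparison, here is a minimal sketch of how per-pair indices could be obtained, assuming NumPy 1.13 or later, where np.unique accepts an axis argument (that assumption does not hold for older installs). The float values and the names pairs, unique_pairs, inverse and groups are illustrative stand-ins, not part of the original code.

```python
import numpy as np

# Float stand-ins for the real data (values copied from the string example).
l_1 = [12.34, 12.34, 12.34, 12.34, 56.78, 56.78, 90.12, 90.12]
l_2 = [-1.23, -1.23, -4.56, -4.56, -6.78, -6.78, -9.01, -9.01]

# One pair per row, shape (8, 2).
pairs = np.column_stack((l_1, l_2))

# axis=0 treats each row as a single item, so pairs are not flattened.
# Assumption: NumPy >= 1.13, where the axis argument is available.
unique_pairs, inverse = np.unique(pairs, axis=0, return_inverse=True)

# Some NumPy releases return the inverse with an extra dimension; flatten it.
inverse = np.ravel(inverse)

# For each unique pair, collect the positions in the original lists.
groups = [np.flatnonzero(inverse == k) for k in range(len(unique_pairs))]

for pair, idx in zip(unique_pairs, groups):
    print(pair, idx)
```

The groups come back in the sorted order of the unique rows, not in order of first appearance, and the comparison is exact equality on the floats.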
I need the indices to operate on other values from similar lists. If I had access to pandas I would use it, but the computer I have to run on only has a very old version of numpy and no pandas. However, this same thing still happens in numpy 1.8.1. I know that I could do something like the following:
sll = list(set(ll))
for i in range(len(sll)):
    inds = np.where([val == sll[i] for val in ll])
    # I do my operations here using inds
but I'm hoping there may be something more elegant?
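One possibility that avoids depending on the numpy version at all is a single pass that groups positions by pair in a dictionary. This is only a sketch under the assumption that exact equality is the right test for the pairs; the name indices_by_pair is hypothetical.

```python
from collections import defaultdict

l_1 = ['12.34', '12.34', '12.34', '12.34', '56.78', '56.78', '90.12', '90.12']
l_2 = ['-1.23', '-1.23', '-4.56', '-4.56', '-6.78', '-6.78', '-9.01', '-9.01']

# Map each (l_1[i], l_2[i]) pair to the list of positions where it occurs.
indices_by_pair = defaultdict(list)
for i, pair in enumerate(zip(l_1, l_2)):
    indices_by_pair[pair].append(i)

for pair, idx in indices_by_pair.items():
    # idx is a plain list of positions; it can index other parallel lists,
    # or be passed to a numpy array as a fancy index.
    print(pair, idx)
```

This walks the data once instead of rescanning the whole list for every unique pair, which is what the np.where loop does.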