
I have a NumPy array (in Python 3) containing, say, 4 unique non-zero values plus zeros (the number of unique values is dynamic and can change). I currently map each unique value to the indices where it occurs, as shown below.

For example, a given array of shape (3, 3, 3) has the values:

array([[[ 0.        ,  0.        ,  0.        ],
        [ 0.        ,  0.        ,  0.        ],
        [ 0.13891649,  0.        , -0.14964542]],

       [[ 0.14316492,  0.16461118,  0.10582511],
        [ 0.09232809,  0.        ,  0.        ],
        [ 0.        ,  0.        ,  0.        ]],

       [[ 0.        ,  0.        ,  0.        ],
        [ 0.        , -0.13443717,  0.        ],
        [ 0.        , -0.15175688,  0.        ]]], dtype=float32)

This array has 4 unique values (excluding 0):

array([-0.1284381 , -0.1008032 ,  0.        ,  0.02159981,  0.08796175],
      dtype=float32)

The current mapping between unique values and their indices is built by:

wts, count = np.unique(x, return_counts=True)
unique_counts_cl1 = dict(zip(wts, count))

unique_counts_cl1
# {-0.1284381: 1, -0.1008032: 2, 0.0: 19, 0.02159981: 3, 0.08796175: 2}

cl1_wts_mapping = {}
wt_indx = 1

for wt in unique_counts_cl1.keys():
    # print(wt, list(unique_counts.keys())[wt_indx - 1])
    cl1_wts_mapping['wt' + str(wt_indx)] = np.where(x == list(unique_counts_cl1.keys())[wt_indx - 1])
    wt_indx += 1


# This creates the mapping-
cl1_wts_mapping                                                        
'''
{'wt1': (array([0]), array([2]), array([2])),
 'wt2': (array([2, 2]), array([1, 2]), array([1, 1])),
 'wt3': (array([0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2]),
  array([0, 0, 0, 1, 1, 1, 2, 1, 1, 2, 2, 2, 0, 0, 0, 1, 1, 2, 2]),
  array([0, 1, 2, 0, 1, 2, 1, 1, 2, 0, 1, 2, 0, 1, 2, 0, 2, 0, 2])),
 'wt4': (array([1, 1, 1]), array([0, 0, 1]), array([0, 2, 0])),
 'wt5': (array([0, 1]), array([2, 0]), array([0, 1]))}
'''

# To access 2nd unique weight-
cl1_wts_mapping['wt2']                                                 
# (array([2, 2]), array([1, 2]), array([1, 1]))

Is there a better or more efficient way to achieve this mapping?

Thanks!

Arun
  • Odds are some of the optional keyword arguments to `unique` will help you. – Andras Deak -- Слава Україні Jun 17 '20 at 13:39
  • @AndrasDeak updated question – Arun Jun 17 '20 at 13:58
  • The hard part is https://stackoverflow.com/questions/54734545/indices-of-unique-values-in-n-dimensional-array. Then you have to take that array of indices and post-process it if you want a dict of indices. But generating those string keys sounds like a bad idea in general; you're probably better off keeping the pair `unique_values, list_of_indices` and using them together. And watch out with exact equality testing of float values; that can give you unexpected problems due to floating point errors. – Andras Deak -- Слава Україні Jun 17 '20 at 14:57
  • @AndrasDeak can you provide sample code to explain why generating string keys is a bad idea? I would like to see an example of using `unique_values, list_of_indices` together. – Arun Jun 17 '20 at 19:35
  • Use either answer on the linked question to get the `ixs` or `indices` array/list that contains the indices you seek. Then you can create your dict with `cl1_wts_mapping = {f'wt{i}': ind for i,ind in enumerate(ixs, 1)}` or something like that. As for the "two arrays together": I just meant keeping both, so that whenever you have to look up "the first unique value" you know the value is `unique_values[0]` and the corresponding indices are `ixs[0]`. No indirection is needed to go from 0 to `'wt1'` to the index array. – Andras Deak -- Слава Україні Jun 17 '20 at 19:39
  • One wrinkle about determining unique *float* values rather than just integer or string is that it's possible for float values to be non-unique in the least significant digits [beyond the current precision shown by np](https://numpy.org/doc/stable/reference/generated/numpy.set_printoptions.html). – smci Jun 17 '20 at 21:07
  • @smci this is why I chose to generate string keys, to avoid all this. But I don't get why it's a bad idea to use string keys (as pointed out by Andras Deak). – Arun Jun 18 '20 at 08:44
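
The approach suggested in the comments (group all indices in one pass, then keep `unique_values` and the index list side by side) could look roughly like the sketch below. This is one possible way to do it, not necessarily what the linked answers do: the small array `x` is made-up illustration data, and the names `inverse`, `order`, `groups` and `indices` are chosen here for the example. It relies on `np.unique` with `return_inverse` and `return_counts`, a stable `np.argsort`, `np.split` and `np.unravel_index`.

```python
import numpy as np

# Illustrative data only (not the array from the question).
x = np.array([[[0.0, 0.5], [0.25, 0.0]],
              [[0.5, 0.0], [0.0, -1.0]]], dtype=np.float32)

# Sorted unique values, each element's position in that sorted list,
# and the number of occurrences of each unique value.
unique_values, inverse, counts = np.unique(
    x, return_inverse=True, return_counts=True)

# A stable sort of the flattened inverse groups the flat positions by value
# while keeping them in ascending order inside each group.
order = np.argsort(inverse.ravel(), kind="stable")

# Cut the sorted positions into one chunk per unique value.
groups = np.split(order, np.cumsum(counts)[:-1])

# Convert flat positions back to N-dimensional index tuples,
# in the same form that np.where(x == value) returns.
indices = [np.unravel_index(g, x.shape) for g in groups]

# Optional: the string-keyed dict from the question, built in one line.
cl1_wts_mapping = {f"wt{i}": ind for i, ind in enumerate(indices, 1)}
```

With this layout, `unique_values[k]` and `indices[k]` always refer to the same value, so the `'wt…'` keys are only needed if other code expects them. The array is scanned a fixed number of times instead of once per unique value, which is where the loop over `np.where` spends its time.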

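
The floating point caveat raised in the comments can be illustrated with a short sketch. The rounding precision (`decimals=6`) is an assumption made for this example, not something stated in the question; the right tolerance depends on how the weights were produced. The idea is to round before calling `np.unique`, and to use `np.isclose` rather than `==` when looking up a value.

```python
import numpy as np

# 0.1 + 0.2 and 0.3 differ only in the last bits of their float64 representation.
a = np.array([0.1 + 0.2, 0.3, 0.0])

# Exact comparison treats them as two different values.
exact_unique = np.unique(a)

# Rounding first merges values that agree to 6 decimals (assumed tolerance).
rounded_unique = np.unique(np.round(a, decimals=6))

# Tolerance-based lookup instead of a == 0.3.
mask = np.isclose(a, 0.3)

print(len(exact_unique), len(rounded_unique), mask)
```

Here exact comparison reports three unique values, while rounding reports two.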