I have an array of unique positive integers, e.g.
>>> unique = np.unique(np.random.choice(100, 4, replace=False))
I also have an array of elements sampled, with repetition, from that array, such as
>>> A = np.random.choice(unique, 100)
I want to map each value in A to the position at which it occurs in unique.
So far, the best solution I have found uses a mapping array:
>>> table = np.zeros(unique.max()+1, unique.dtype)
>>> table[unique] = np.arange(unique.size)
This stores, for each value in unique, its index in unique at position table[value], so the table can then be used to map A through advanced indexing:
>>> table[A]
array([2, 2, 3, 3, 3, 3, 1, 1, 1, 0, 2, 0, 1, 0, 2, 1, 0, 0, 2, 3, 0, 0, 0,
       0, 3, 3, 2, 1, 0, 0, 0, 2, 1, 0, 3, 0, 1, 3, 0, 1, 2, 3, 3, 3, 3, 1,
       3, 0, 1, 2, 0, 0, 2, 3, 1, 0, 3, 2, 3, 3, 3, 1, 1, 2, 0, 0, 2, 0, 2,
       3, 1, 1, 3, 3, 2, 1, 2, 0, 2, 1, 0, 1, 2, 0, 2, 0, 1, 3, 0, 2, 0, 1,
       3, 2, 2, 1, 3, 0, 3, 3], dtype=int32)
This already gives the correct result. However, if the values in unique are large and sparse, this approach requires allocating a very large table array just to store a few indices for later mapping.
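To put a number on that concern, here is a small sketch that computes how large the table would be without actually allocating it. The values in sparse_unique are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical sparse values, chosen only to illustrate the problem.
sparse_unique = np.array([3, 1_000, 1_000_000_000], dtype=np.int64)

# Size of the lookup table the mapping-array approach would need,
# computed arithmetically so nothing large is allocated here.
table_entries = int(sparse_unique.max()) + 1
table_bytes = table_entries * sparse_unique.dtype.itemsize

print(f"{table_entries} entries (about {table_bytes / 1e9:.1f} GB) "
      f"to map only {sparse_unique.size} values")
```

The table size depends on the largest value in unique, not on how many values there are, which is what makes the approach wasteful for sparse data.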
Is there any better solution?
NOTE: both A and unique are sample arrays, not my real data. So the question is not how to generate positional indices, but how to efficiently map the elements of A to their indices in unique. The pseudocode for what I'd like to speed up in NumPy is as follows:
B = np.zeros_like(A)
for i in range(A.size):
    B[i] = unique.index(A[i])
(assuming unique is a list in the above pseudocode).
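To make the target concrete, here is a minimal runnable sketch of that loop, used as a reference result. It also checks np.searchsorted as one possible table-free candidate, under two assumptions: unique is sorted (which np.unique guarantees) and every element of A actually occurs in unique. The names unique_list, B_loop and B_search are introduced here for illustration only, and the snippet does not settle whether this is the best option.

```python
import numpy as np

rng = np.random.default_rng(0)
unique = np.unique(rng.choice(100, 4, replace=False))  # sorted, no repeats
A = rng.choice(unique, 100)

# Reference result: the loop from the pseudocode, with unique as a list.
unique_list = unique.tolist()
B_loop = np.zeros_like(A)
for i in range(A.size):
    B_loop[i] = unique_list.index(A[i])

# Candidate: binary search of each element of A in the sorted array.
# Assumption: every value in A is present in unique; otherwise
# searchsorted returns an insertion point, not a valid match.
B_search = np.searchsorted(unique, A)

print(np.array_equal(B_loop, B_search))
```

Unlike the mapping table, this needs no memory proportional to unique.max(), at the cost of a binary search per element.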