I have the following question. Is there some kind of method in NumPy or SciPy that I can use to turn a given unsorted array like this

a = np.array([0,0,1,1,4,4,4,4,5,1891,7]) #could be any number here

into one where the numbers are mapped so that there are no gaps between the values and they stay in the same order as before?

[0,0,1,1,2,2,2,2,3,5,4]

EDIT

Is it furthermore possible to swap/shuffle the numbers after the mapping, so that

[0,0,1,1,2,2,2,2,3,5,4]

becomes something like:

[0,0,3,3,5,5,5,5,4,1,2]
Liwellyen
  • I don't understand how you get from A to B. How do you know, for example, how many 2's to put in? Or Whether there should be a 7? – Robᵩ Mar 23 '18 at 22:51
  • I have a given array like the first, and want to map the numbers so that there is no gap between the values , so the 4 '4s' become the 2s – Liwellyen Mar 23 '18 at 22:54
  • Are you assuming your original array is sorted? – hoyland Mar 23 '18 at 22:56
  • No, unfortunately it's not sorted; I changed it a bit – Liwellyen Mar 23 '18 at 22:57
  • Duplicate of [Rank items in an array using Python/NumPy, without sorting array twice - Stack Overflow](https://stackoverflow.com/questions/5284646/rank-items-in-an-array-using-python-numpy-without-sorting-array-twice); however, this question explicitly asks for `method="dense"`. – user202729 Jan 11 '21 at 10:48
  • And this is a subset of https://stackoverflow.com/questions/24315246/convert-numpy-array-to-order-of-elements-when-duplicate-values-are-present (only 1D) – user202729 Jan 11 '21 at 10:57

2 Answers

1

Edit: I'm not sure what the etiquette is here (should this be a separate answer?), but this is actually directly obtainable from np.unique.

>>> u, indices = np.unique(a, return_inverse=True)
>>> indices
array([0, 0, 1, 1, 2, 2, 2, 2, 3, 5, 4])
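
For the follow-up in the question's EDIT (shuffling the labels after the mapping), here is a minimal sketch. It assumes that any random relabeling is acceptable as long as equal input values keep sharing one label; the names `rng`, `perm`, and `shuffled` and the seed are illustrative, not from the original post.

```python
import numpy as np

a = np.array([0, 0, 1, 1, 4, 4, 4, 4, 5, 1891, 7])

# Dense labels: index of each element within the sorted unique values.
u, indices = np.unique(a, return_inverse=True)

# Assumption: a random permutation of the labels is what "shuffle" means here.
# The seed is arbitrary and only makes the run repeatable.
rng = np.random.default_rng(0)
perm = rng.permutation(len(u))  # perm[k] is the new label for old label k

# Fancy indexing applies the relabeling to every element at once.
shuffled = perm[indices]
print(shuffled)
```

Because `perm` is a permutation of `0..len(u)-1`, the result still uses every label exactly once per distinct value, so it stays gap-free.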

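Original answer: This isn't too hard to do in plain Python by building a dictionary that maps each value of the array to its index among the sorted unique values:

# np.unique already returns the unique values in sorted order
x = np.unique(a)
# map each distinct value to its position in that sorted order
index_dict = {j: i for i, j in enumerate(x)}
[index_dict[i] for i in a]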
hoyland
1

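It seems you need a dense ranking of your array, in which case use scipy.stats.rankdata. Dense ranks start at 1, so subtract 1 to get zero-based labels; the result shown here is a float array, so append .astype(int) if you need integer labels:

from scipy.stats import rankdata
rankdata(a, method='dense') - 1
# array([ 0.,  0.,  1.,  1.,  2.,  2.,  2.,  2.,  3.,  5.,  4.])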
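
As a quick sanity check, the dense ranks can be compared against the `np.unique(..., return_inverse=True)` result from the other answer. This is only a sketch on the question's sample array; the cast to `int` is there because the dtype `rankdata` returns is not assumed here.

```python
import numpy as np
from scipy.stats import rankdata

a = np.array([0, 0, 1, 1, 4, 4, 4, 4, 5, 1891, 7])

# Zero-based dense ranks, cast to int so the dtype does not matter.
ranks = rankdata(a, method='dense').astype(int) - 1

# Inverse indices into the sorted unique values.
_, inverse = np.unique(a, return_inverse=True)

print(np.array_equal(ranks, inverse))  # True
```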
Psidom