I suspect I am misunderstanding something.
Problem: Given a series, I want to return a new series in which the value at each row is the position that row would occupy if the series were sorted.
I posted a different question, and it seemed like argsort was the right solution. But after reading about argsort, I believe it is not. Here is what the documentation says:
Returns the indices that would sort an array.
Here is an example:
>>> import numpy as np
>>> import pandas as pd
>>> test = pd.Series(np.random.randint(20, size=10), index=['red', 'green', 'yellow', 'purple', 'orange', 'white', 'black', 'pink', 'brown', 'gray'])
>>> test
red 2
green 17
yellow 8
purple 19
orange 12
white 0
black 15
pink 5
brown 14
gray 14
>>> test.argsort()
red 5
green 0
yellow 7
purple 2
orange 4
white 8
black 9
pink 6
brown 1
gray 3
But what I actually want is the position each color would have if the series were sorted. For example, here is test.sort_values():
>>> test.sort_values()
white 0
red 2
pink 5
yellow 8
orange 12
brown 14
gray 14
black 15
green 17
purple 19
dtype: int64
The argsort output makes sense, because test[test.argsort()] produces the same result as test.sort_values().
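To make that relationship concrete, here is a minimal sketch that hard-codes the values printed above (instead of drawing new random ones) and checks that reordering by the argsort result matches sort_values. Two things here are assumptions of the sketch rather than part of the original session: `kind='stable'` is passed so the tie between brown and gray is resolved the same way in both calls, and `.iloc` is used so the integers are clearly treated as positions rather than labels.

```python
import numpy as np
import pandas as pd

colors = ['red', 'green', 'yellow', 'purple', 'orange',
          'white', 'black', 'pink', 'brown', 'gray']
# Same values as the printed example, fixed so the result is reproducible.
test = pd.Series([2, 17, 8, 19, 12, 0, 15, 5, 14, 14], index=colors)

# order[k] is the original position of the k-th smallest value.
order = np.argsort(test.to_numpy(), kind='stable')

# Taking rows in that order yields the sorted series.
reordered = test.iloc[order]
same = reordered.equals(test.sort_values(kind='stable'))

print(order.tolist())
print(same)
```

So the argsort result answers "which row goes into sorted slot k?", which is the inverse of what I am asking for.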
So what do I do to get something like this?
red 1
green 8
yellow 3
purple 9
orange 4
white 0
black 7
pink 2
brown 5
gray 6
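For reference, a sketch of two approaches that appear to reproduce exactly this output on the data above; I am not claiming either is the canonical way. The first inverts the sorting permutation by applying argsort a second time. The second uses Series.rank with method='first' and subtracts 1 to make the ranks zero-based. Both assume ties (brown and gray) should be ordered by their original position, and the variable names are just illustrative.

```python
import numpy as np
import pandas as pd

colors = ['red', 'green', 'yellow', 'purple', 'orange',
          'white', 'black', 'pink', 'brown', 'gray']
# Same values as the printed example, fixed so the result is reproducible.
test = pd.Series([2, 17, 8, 19, 12, 0, 15, 5, 14, 14], index=colors)

# Step 1: positions that would sort the values (stable, so ties keep input order).
order = np.argsort(test.to_numpy(), kind='stable')

# Step 2: argsort of a permutation is its inverse, i.e. for each original row,
# the slot it lands in after sorting.
positions = pd.Series(np.argsort(order), index=test.index)

# Alternative: rank in order of appearance for ties, shifted to start at 0.
via_rank = test.rank(method='first').astype(int) - 1

print(positions)
```

With these values, positions has red at 1, green at 8, and white at 0, matching the listing above, and via_rank holds the same numbers.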
This is similar to the question "Numpy argsort - what is it doing?", but I don't think the answers there ever covered what I want the function to do.
I hope this makes sense.