I've got what seems to me a fairly clear-cut example where numpy argsort produces odd results. If I create an example array of characters:

letters = np.array([['b','a','c'],
                    ['c','a','b'],
                    ['b','c','a']]).astype(str)

I'm then looking to sort along the rows (and to retain the sorting sequence, for another use later). The output I get from argsort is

sort_seq = np.argsort(letters, axis=1)
sort_seq
array([[1, 0, 2],
       [1, 2, 0],
       [2, 0, 1]])

This seems to get the first row right, but not the others. If I use it to reconstruct the array then I get:

output = np.full_like(letters, '')
np.put_along_axis(output, sort_seq, letters,axis=1)
output

which gives

array([['a', 'b', 'c'],
       ['b', 'c', 'a'],
       ['c', 'a', 'b']], dtype='<U1')

Looking here and on other sites, I can see that argsort on multi-dimensional arrays has not always behaved the way people expected. But this example seems very close to the one given in the numpy documentation - surely it must work in this case?

Thanks for any assistance!

Chris J Harris

1 Answer

Looks good to me:

In [88]: letters = np.array([['b','a','c'],
    ...:                     ['c','a','b'],
    ...:                     ['b','c','a']]).astype(str)
    ...: sort_seq = np.argsort(letters, axis=1)
    ...:                     
In [89]: np.take_along_axis(letters, sort_seq, axis=1)
Out[89]: 
array([['a', 'b', 'c'],
       ['a', 'b', 'c'],
       ['a', 'b', 'c']], dtype='<U1')

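This may be a question of how we understand argsort: is it the 'take' order or the 'put' order? As the take_along_axis result above shows, it is the 'take' order: sort_seq[i, j] is the column in row i that holds the j-th smallest element.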
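To make the 'take' reading concrete, here is a minimal sketch (the names `taken` and `manual` are just for illustration) that spells out what take_along_axis does with the argsort result, element by element:

```python
import numpy as np

letters = np.array([['b', 'a', 'c'],
                    ['c', 'a', 'b'],
                    ['b', 'c', 'a']])
sort_seq = np.argsort(letters, axis=1)

# 'Take' reading: output position j is filled FROM column sort_seq[i, j].
taken = np.take_along_axis(letters, sort_seq, axis=1)

# The same thing written as explicit loops, for comparison.
manual = np.empty_like(letters)
for i in range(letters.shape[0]):
    for j in range(letters.shape[1]):
        manual[i, j] = letters[i, sort_seq[i, j]]
```

Both `taken` and `manual` should match `np.sort(letters, axis=1)`. By contrast, put_along_axis writes letters[i, j] INTO column sort_seq[i, j], which is the opposite direction.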

To use put instead, we need a double sort: the argsort of sort_seq gives, for each element, the position it should be written to.

In [91]: dblsort= np.argsort(sort_seq, axis=1)
In [92]: dblsort
Out[92]: 
array([[1, 0, 2],
       [2, 0, 1],
       [1, 2, 0]])
In [93]: res = np.zeros_like(letters)
In [94]: np.put_along_axis(res, dblsort, letters, axis=1)
In [95]: res
Out[95]: 
array([['a', 'b', 'c'],
       ['a', 'b', 'c'],
       ['a', 'b', 'c']], dtype='<U1')
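The double argsort works because it yields the inverse of each row's permutation, i.e. the rank of every element within its row. As a rough sketch (the names `ranks`, `inverse` and `positions` are mine, not from numpy), the same inverse can also be built without a second sort by scattering the column positions through sort_seq:

```python
import numpy as np

letters = np.array([['b', 'a', 'c'],
                    ['c', 'a', 'b'],
                    ['b', 'c', 'a']])
sort_seq = np.argsort(letters, axis=1)

# Double argsort: rank of each element within its row ('put' order).
ranks = np.argsort(sort_seq, axis=1)

# Equivalent inverse permutation without sorting again:
# for every j, set inverse[i, sort_seq[i, j]] = j.
positions = np.broadcast_to(np.arange(sort_seq.shape[1]), sort_seq.shape)
inverse = np.empty_like(sort_seq)
np.put_along_axis(inverse, sort_seq, positions, axis=1)

# Either index array can be used with put_along_axis to sort the rows.
res = np.full_like(letters, '')
np.put_along_axis(res, inverse, letters, axis=1)
```

Here `ranks` and `inverse` should be identical, and `res` should equal the row-sorted array. The scatter version is only worth considering when the extra sort matters; for small arrays the double argsort is simpler to read.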

Related question: numpy: find index in sorted array (in an efficient way)

hpaulj
Thanks - I suspected this might be a case of me not knowing what I'm doing rather than a glaring flaw in numpy itself. I'll get to grips with put_along_axis vs take_along_axis. Thanks for helping fix this immediate problem. – Chris J Harris Feb 13 '19 at 01:31