38

Given an array 'a' I would like to sort the array by columns sort(a, axis=0) do some stuff to the array and then undo the sort. By that I don't mean re sort but basically reversing how each element was moved. I assume argsort() is what I need but it is not clear to me how to sort an array with the results of argsort() or more importantly apply the reverse/inverse of argsort()

Here is a little more detail

I have an array a, shape(a) = rXc I need to sort each column

aargsort = a.argsort(axis=0)  # May use this later
aSort = a.sort(axis=0)

now average each row

aSortRM = asort.mean(axis=1)

now replace each col in a row with the row mean. is there a better way than this

aWithMeans = ones_like(a)
for ind in range(r)  # r = number of rows
    aWithMeans[ind]* aSortRM[ind]

Now I need to undo the sort I did in the first step. ????

sophros
  • 14,672
  • 11
  • 46
  • 75
Vincent
  • 1,579
  • 4
  • 23
  • 38
  • 1
    Why can't you just make a copy of the array: `a.copy()` before any transformations or use `aSort = numpy.sort(axis=0)` (that will return sorted copy)? btw, `a.sort()` returns nothing therefore there is no point to assign its return value. – jfs Mar 20 '10 at 16:05
  • @J.F. Sebastian, Thanks you are right I fixed it – Vincent Mar 20 '10 at 16:46

7 Answers7

81

There are probably better solutions to the problem you are actually trying to solve than this (performing an argsort usually precludes the need to actually sort), but here you go:

>>> import numpy as np
>>> a = np.random.randint(0,10,10)
>>> aa = np.argsort(a)
>>> aaa = np.argsort(aa)
>>> a # original
array([6, 4, 4, 6, 2, 5, 4, 0, 7, 4])
>>> a[aa] # sorted
array([0, 2, 4, 4, 4, 4, 5, 6, 6, 7])
>>> a[aa][aaa] # undone
array([6, 4, 4, 6, 2, 5, 4, 0, 7, 4])
Paul
  • 42,322
  • 15
  • 106
  • 123
12

For all those still looking for an answer:

In [135]: r = rand(10)

In [136]: i = argsort(r)

In [137]: r_sorted = r[i]

In [138]: i_rev = zeros(10, dtype=int)

In [139]: i_rev[i] = arange(10)

In [140]: allclose(r, r_sorted[i_rev])

Out[140]: True
Martijn Pieters
  • 1,048,767
  • 296
  • 4,058
  • 3,343
jesse
  • 121
  • 1
  • 2
9

I'm not sure how best to do it in numpy, but, in pure Python, the reasoning would be:

aargsort is holding a permutation of range(len(a)) telling you where the items of aSort came from -- much like, in pure Python:

>>> x = list('ciaobelu')
>>> r = range(len(x))
>>> r.sort(key=x.__getitem__)
>>> r
[2, 4, 0, 5, 1, 6, 3, 7]
>>> 

i.e., the first argument of sorted(x) will be x[2], the second one x[4], and so forth.

So given the sorted version, you can reconstruct the original by "putting items back where they came from":

>>> s = sorted(x)
>>> s
['a', 'b', 'c', 'e', 'i', 'l', 'o', 'u']
>>> original = [None] * len(s)
>>> for i, c in zip(r, s): original[i] = c
... 
>>> original
['c', 'i', 'a', 'o', 'b', 'e', 'l', 'u']
>>> 

Of course there are going to be tighter and faster ways to express this in numpy (which unfortunately I don't know inside-out as much as I know Python itself;-), but I hope this helps by showing the underlying logic of the "putting things back in place" operation you need to perform.

Alex Martelli
  • 854,459
  • 170
  • 1,222
  • 1,395
8

Super late to the game, but here:

import numpy as np
N = 1000 # or any large integer
x = np.random.randn( N )
I = np.argsort( x )
J = np.argsort( I )
print( np.allclose( x[I[J]] , x ) )
>> True

Basically, argsort the argsort because the nth element of the reverse sort is J[n] = k : I[k] = n. That is, I[J[n]] = n, so J sorts I.

W. Ross Morrow
  • 101
  • 1
  • 3
2

indices=np.argsort(a) gives you the sorting indices, such that x = a[indices] is the sorted array. y = b[indices] pushs forward the array b to sorted domain. c[indices] = z pulls back z from sorted domain into c in source domain.

For instance,

import numpy as np

n = 3
a = np.random.randint(0,10,n) # [1, 5, 2]
b = np.random.randn(n) # [-.1, .5, .2]
c = np.empty_like(a)

# indices that sort a: x=a[indices], x==np.sort(a) is all True
indices = np.argsort(a) # [0,2,1]

# y[i] is the value in b at the index of the i-th smallest value in a
y = b[indices] # [-.1, .2, .5]

# say z[i] is some value related to the i-th smallest entry in a 
z = np.random.randn(n) # [-1.1, 1.2, 1.3]
c[indices] = z # inverted the sorting map, c = [-1.1, 1.3, 1.2] 
talbon
  • 21
  • 2
1

I was not able to follow your example, but the more abstract problem--i.e., how to sort an array then reverse the sort--is straightforward.

import numpy as NP
# create an 10x6 array to work with
A = NP.random.randint(10, 99, 60).reshape(10, 6)
# for example, sort this array on the second-to-last column, 
# breaking ties using the second column (numpy requires keys in
# "reverse" order for some reason)
keys = (A[:,1], A[:,4])
ndx = NP.lexsort(keys, axis=0)
A_sorted = NP.take(A, ndx, axis=0)

To "reconstruct" A from A_sorted is trivial because remember that you used an index array ('ndx') to sort the array in the first place.

# ndx array for example above:  array([6, 9, 8, 0, 1, 2, 4, 7, 3, 5])

In other words, the 4th row in A_sorted was the 1st row in the original array, A, etc.

doug
  • 69,080
  • 24
  • 165
  • 199
  • I actually want to sort each column individually, I correct my code at the top but I need to work with np.sort(A, axis=0) so the inex would be np.argsort(x, axis=0) – Vincent Mar 20 '10 at 16:51
1

A faster alternative to getting the inverse of argsort, inspired by Alex Martelli

def get_inv (argsort):
    argsort_inv = np.arange(len(argsort))
    argsort_inv[argsort] = argsort_inv.copy()
    return argsort_inv

For arrays greater in size than 1E4 I see a 10x performance gain