
I'm converting all my code to Python. I have an array with two columns that I want to sort in ascending order by the second column. Then I need to sum the first-column values over a range of rows (for example, from the first row to the 100th). I tried "Data.sort(axis=1)", but it doesn't work. Does anyone have an idea how to solve this?

Sam

2 Answers


Use .argsort(): it returns a numpy.array of indices that sort the given numpy.array. You can call it as a function or as a method on your array. For example, suppose you have

import numpy as np

arr = np.array([[-0.30565392, -0.96605562],
                [ 0.85331367, -2.62963495],
                [ 0.87839643, -0.28283675],
                [ 0.72676698,  0.93213482],
                [-0.52007354,  0.27752806],
                [-0.08701666,  0.22764316],
                [-1.78897817,  0.50737573],
                [ 0.62260038, -1.96012161],
                [-1.98231706,  0.36523876],
                [-1.07587382, -2.3022289 ]])

You can now call .argsort() on the column you want to sort by. It gives you an array of row indices that sort that particular column, which you can then use to index your original array.

>>> arr[arr[:, 1].argsort()]
array([[ 0.85331367, -2.62963495],
       [-1.07587382, -2.3022289 ],
       [ 0.62260038, -1.96012161],
       [-0.30565392, -0.96605562],
       [ 0.87839643, -0.28283675],
       [-0.08701666,  0.22764316],
       [-0.52007354,  0.27752806],
       [-1.98231706,  0.36523876],
       [-1.78897817,  0.50737573],
       [ 0.72676698,  0.93213482]])

Equivalently, you can use numpy.argsort() as a function:

>>> arr[np.argsort(arr[:, 1])]
array([[ 0.85331367, -2.62963495],
       [-1.07587382, -2.3022289 ],
       [ 0.62260038, -1.96012161],
       [-0.30565392, -0.96605562],
       [ 0.87839643, -0.28283675],
       [-0.08701666,  0.22764316],
       [-0.52007354,  0.27752806],
       [-1.98231706,  0.36523876],
       [-1.78897817,  0.50737573],
       [ 0.72676698,  0.93213482]])
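
To cover the second half of the question (summing the first column over the first rows after sorting), here is a minimal sketch. The sample data and the helper name sum_first_column_sorted are made up for illustration and are not part of the original question. It also shows why sort(axis=1) does not do what was intended: it sorts the values inside each row independently, so the pairing between the two columns is lost.

```python
import numpy as np

# Hypothetical two-column data: column 0 holds the values to sum,
# column 1 holds the sort key.
data = np.array([[1.0, 0.70],
                 [0.0, 0.69],
                 [3.0, 0.57],
                 [0.0, 0.68],
                 [1.0, 0.56],
                 [2.0, 0.51]])


def sum_first_column_sorted(a, n):
    """Sort rows by column 1 (ascending), then sum column 0 over the first n rows."""
    order = a[:, 1].argsort()      # row indices that sort column 1
    sorted_rows = a[order]         # whole rows reordered together
    return sorted_rows[:n, 0].sum()


total = sum_first_column_sorted(data, 3)

# For contrast: sorting along axis=1 reorders the two values *within* each row,
# which is what Data.sort(axis=1) does (in place).
rowwise = np.sort(data, axis=1)
```

For the 100-row case in the question, the call would be sum_first_column_sorted(Data, 100); slicing with [:n] simply stops at the end of the array if it has fewer than n rows.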
JaminSore

sorted(Data, key=lambda row: row[1]) should do it.
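
One thing to keep in mind with this approach, sketched below under the assumption that Data is a two-dimensional NumPy array: sorted() iterates over the rows and returns a plain Python list of 1-D arrays, so you may want to wrap the result in np.array() before slicing columns. The sample values are made up for illustration.

```python
import numpy as np

# Hypothetical sample data; column 1 is the sort key.
data = np.array([[1.0, 0.70],
                 [0.0, 0.69],
                 [3.0, 0.57]])

# sorted() yields a list of row arrays ordered by the second element of each row.
rows = sorted(data, key=lambda row: row[1])

# Convert back to a 2-D array so that column slicing works again.
as_array = np.array(rows)

# Sum of the first column over the first two sorted rows.
first_two_sum = as_array[:2, 0].sum()
```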

a p
  • With this command I have the same problem as before: duplication in the sorted output. If the input data is: Data=[1.0 0.70 0.0 0.69 3.0 0.57 0.0 0.68 1.0 0.56 2.0 0.51] the sorting results are: [['0.0', '0.68'], ['0.0', '0.69'], ['0.70', '1.0'], ['0.56', '1.0'], ['0.51', '2.0'], ['0.57', '3.0']] Do you have another idea? – Sam Mar 27 '14 at 21:43
  • I'm afraid I don't quite understand the problem. What's duplicated? If your input `Data` is a flat list, why would sorting it result in a list of lists? – a p Mar 27 '14 at 22:31