
I'm looking to sort a set of comma-separated numeric values column-wise. The number of columns may vary from row to row. For example, in Python:

        print(sorted(['9,11', '70,10', '10,8,1','10,70']))

produces

        ['10,70', '10,8,1', '70,10', '9,11']

while the desired result is

        ['9,11', '10,8,1', '10,70', '70,10']

First, sort by the first column, then by the second, etc.

Obviously this can be done, but can this be done elegantly?

  • First, iterate once, parse the strings and get the first number as their weights. Create a class with fields like `weight` and `value`. While iterating, create new variables with this class and generate a new list with them. Then sort those objects according to their weights, finally iterate once and collect sorted values. – Bahadir Tasdemir Feb 26 '17 at 17:50
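A minimal sketch of the approach described in the comment above, with one change that is my assumption rather than the commenter's wording: the weight holds all parsed columns instead of only the first number, since the first number alone would not break ties such as `10,8,1` versus `10,70`. The names `Row` and `sort_rows` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Row:
    weight: tuple  # parsed integer columns, used only for ordering
    value: str     # the original comma-separated string


def sort_rows(lines):
    # First pass: parse each string and wrap it with its weight.
    rows = [Row(tuple(int(x) for x in line.split(',')), line) for line in lines]
    # Sort the wrapper objects by weight (tuples compare element by element).
    rows.sort(key=lambda r: r.weight)
    # Second pass: collect the original strings in sorted order.
    return [r.value for r in rows]


print(sort_rows(['9,11', '70,10', '10,8,1', '10,70']))
# prints ['9,11', '10,8,1', '10,70', '70,10']
```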

2 Answers


It can be done more elegantly by using the key argument of sorted:

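        data = [
            '9,11',
            '70,10',
            '10,8,1',
            '10,70'
        ]

        # Key: each string parsed into a list of ints, so rows compare column by column.
        # list() is needed on Python 3, where map() returns an iterator.
        print(sorted(data, key=lambda s: list(map(int, s.split(',')))))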

Result:

        ['9,11', '10,8,1', '10,70', '70,10']

The code above converts each string in the list to a list of integers and uses that list as the sort key.
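To see why an integer-list key gives the desired order, here is a small sketch contrasting it with plain string sorting. The helper name `column_key` is hypothetical; it assumes every field is a valid integer. Python compares lists element by element, and a list that is a prefix of another sorts first, which is what handles rows with different column counts.

```python
def column_key(line):
    # Parse 'a,b,c' into [a, b, c] as integers.
    return [int(field) for field in line.split(',')]


data = ['9,11', '70,10', '10,8,1', '10,70']

# Plain string comparison: '10,70' sorts before '9,11' because '1' < '9'.
print(sorted(data))
# prints ['10,70', '10,8,1', '70,10', '9,11']

# Integer-list comparison: first column, then second, and so on.
print(sorted(data, key=column_key))
# prints ['9,11', '10,8,1', '10,70', '70,10']

# Ties on the first column are broken by the second: 8 < 70.
print(column_key('10,8,1') < column_key('10,70'))  # True

# A shorter row that is a prefix of a longer one sorts first.
print(column_key('10') < column_key('10,0'))  # True
```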

JuniorCompressor
  • yep, and in Python 3 apparently sorted(data, key=lambda s : list(map(int, s.split(',')))) is required. Any way to do this if data is a column in a dataframe? – Tnatsissa H Craeser Feb 27 '17 at 00:45

If you don't mind third-party modules, you can use natsort, which provides the function natsorted, designed to be a drop-in replacement for sorted.

        >>> import natsort
        >>> natsort.natsorted(['9,11', '70,10', '10,8,1','10,70'])
        ['9,11', '10,8,1', '10,70', '70,10']

Full disclosure, I am the package's author.

SethMMorton