Python double sort with split string

Question

I am trying to sort a list on two keys, but the second sort seems to forget the first. I thought Python's sort was stable, so I am probably making a mistake.

Original text is an array that looks like this:

benzene - 30.0 - 15
xylene - 5.0 - 10
benzene - 8.5 - 29
benzene - 0.5 - 11

I want:

benzene - 0.5 - 11
benzene - 8.5 - 29
benzene - 30.0 - 15
xylene - 5.0 - 10

Here is my code:

def akey(a):
    z = a.split(' -')
    v = [z[0]]
    x = [str(i) for i in v]
    return x

def bkey(b):
    z = b.split(' -')
    v = [z[1]]
    x = [float(i) for i in v]
    return x

labelList.sort(key=akey)
labelList.sort(key=bkey)

Thanks for the help

That's not what stable sort means. 1.2 is smaller than 1.3, so `BAC - 1.2 - 10` will be sorted ahead of `ABC - 1.3 - 29` — NullUserException, Sep 12 '13 at 23:48
Stable sort means that any elements that have the same key value will maintain their order relative to each other during a single sort operation. The order is not guaranteed over distinct sorts. — Asad Saeeduddin, Sep 12 '13 at 23:52
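
To make the point about stability concrete, here is a minimal sketch using the four strings from the question. The names `labels` and `by_name` are illustrative and do not come from the original post. Sorting on the name alone leaves the three benzene rows tied, and a stable sort keeps tied rows in their input order:

```python
# The four sample rows from the question, as a plain list of strings.
labels = [
    "benzene - 30.0 - 15",
    "xylene - 5.0 - 10",
    "benzene - 8.5 - 29",
    "benzene - 0.5 - 11",
]

# Sort on the first field only. All benzene rows share the same key,
# so stability keeps them in their original order: 30.0, 8.5, 0.5.
by_name = sorted(labels, key=lambda s: s.split(" - ")[0])

for row in by_name:
    print(row)
```

Stability says nothing about the middle column here: the benzene rows are not ordered by value, only left as they were.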

Caleb Hattingh · Answer 1 · 2013-09-13T00:34:12.500

As @NullUserException said, the two sorts as written don't work: the second pass reorders the whole list by the middle (float) column alone, and the order from the first (str) pass survives only among rows whose middle values are equal.

You can do the sorting in one shot after transforming the data appropriately, and you don't have to worry about keys:

s='''ABC - 0.2 - 15
BAC - 1.2 - 10
ABC - 1.3 - 29
ABC - 0.7 - 11'''

data = s.split('\n')

data
Out[5]: ['ABC - 0.2 - 15', 'BAC - 1.2 - 10', 'ABC - 1.3 - 29', 'ABC - 0.7 - 11']

newdata = [(i[0],float(i[1]),i[2]) for i in [k.split(' - ') for k in data]]

newdata
Out[10]: 
[('ABC', 0.2, '15'),
 ('BAC', 1.2, '10'),
 ('ABC', 1.3, '29'),
 ('ABC', 0.7, '11')]

sorted(newdata)
Out[11]: 
[('ABC', 0.2, '15'),
 ('ABC', 0.7, '11'),
 ('ABC', 1.3, '29'),
 ('BAC', 1.2, '10')]

Another approach: a lambda key may be easier if restructuring the input data would take a lot of manipulation:

# data is a list of strings
data = ['ABC - 0.2 - 15', 'BAC - 1.2 - 10', 'ABC - 1.3 - 29', 'ABC - 0.7 - 11']

# My key is now a lambda function that extracts the 
# sortable parts, and assembles them in a tuple.
# Note how easy it would be to change the sort order,
# just order the entries in the inner tuple differently.
# If data is some other array-like structure, just change
# how the inner data is accessed when building your tuple.

sorted(data, key=lambda x: (x.split(' - ')[0], float(x.split(' - ')[1])))
Out[18]: ['ABC - 0.2 - 15', 'ABC - 0.7 - 11', 'ABC - 1.3 - 29', 'BAC - 1.2 - 10']

The problem is that my original data is an array so I can't run .split on it. Otherwise your solution would work — user2774582, Sep 13 '13 at 00:09
If you update your question with the exact form of your data (give, say, four elements like you currently have) I will update my answer. — Caleb Hattingh, Sep 13 '13 at 00:18
When you say "array", do you mean `list` or `array.array` or `numpy.ndarray` or "lines in a file"? — Caleb Hattingh, Sep 13 '13 at 00:28

score 1 · Accepted Answer · answered Sep 12 '13 at 23:49

Why don't you try sorting by bkey first and then by akey?

Basically, you have two sort priorities, and the left-most column has the higher priority. Because the sort is stable, if you sort by the lower-priority key first and the higher-priority key last, you will get the result you want.
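
A minimal sketch of that two-pass ordering, assuming the data is a plain list of strings like the sample in the question. The helper names `name_key` and `value_key` are illustrative stand-ins for the question's `akey` and `bkey`:

```python
labels = [
    "benzene - 30.0 - 15",
    "xylene - 5.0 - 10",
    "benzene - 8.5 - 29",
    "benzene - 0.5 - 11",
]

def name_key(label):
    # First field: the compound name, compared as text.
    return label.split(" - ")[0]

def value_key(label):
    # Second field: compared as a number, not as text.
    return float(label.split(" - ")[1])

# Lower-priority key first, higher-priority key last. The second pass
# is stable, so rows with the same name keep their numeric order.
labels.sort(key=value_key)
labels.sort(key=name_key)

for row in labels:
    print(row)
```

The same result can be had in one pass with a tuple key, but the two-pass form is handy when the keys need different directions (for example, `reverse=True` on only one of them).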

Ricardo Mogg

Tried that already, just flips the order and forgets the first sort – user2774582 Sep 13 '13 at 00:06
@user2774582 This should actually work. Sort first by bkey, then by akey. – flornquake Sep 13 '13 at 00:19
Ah! Figured out the problem. This actually does work, but the reason why I thought it didn't work when I first tried is because it sorts "benzene" separately from "BENZENE" which is why I assumed it wasn't working. I also couldn't spot the issue since my final output was in all caps. Thanks guys, original code does work. – user2774582 Sep 13 '13 at 02:13
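
For the mixed-case issue described in that last comment, one possible fix is to normalise case inside the key. This is only a sketch: the mixed-case sample rows and the `sort_key` helper are made up for illustration, since the actual data was not posted.

```python
# Hypothetical rows with inconsistent capitalisation.
labels = [
    "BENZENE - 30.0 - 15",
    "xylene - 5.0 - 10",
    "benzene - 8.5 - 29",
    "Benzene - 0.5 - 11",
]

def sort_key(label):
    # Split once, then build a (name, value) tuple. casefold() makes
    # the name comparison case-insensitive; float() makes the value
    # comparison numeric.
    name, value, _rest = label.split(" - ", 2)
    return (name.casefold(), float(value))

case_sensitive = sorted(labels, key=lambda s: s.split(" - ")[0])
case_insensitive = sorted(labels, key=sort_key)

print(case_sensitive)
print(case_insensitive)
```

With the default comparison, uppercase letters sort before lowercase ones, so "BENZENE", "Benzene" and "benzene" are treated as three different names rather than one group.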

score 0 · Answer 3 · answered Sep 12 '13 at 23:51

You can convert every string to a list by splitting and then sort these lists in the standard way:

>>> l = ["ABC - 0.2 - 15", "BAC - 1.2 - 10", "ABC - 1.3 - 29", "ABC - 0.7 - 11"]
>>> l.sort(key=lambda x: x.split(' - ')[:2])
>>> l
['ABC - 0.2 - 15', 'ABC - 0.7 - 11', 'ABC - 1.3 - 29', 'BAC - 1.2 - 10']
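
One caveat with this key: the slice leaves the second field as a string, so it is compared character by character. That happens to work for the sample above, but with the values from the question ("30.0", "8.5", "0.5") the text order and the numeric order differ. A small sketch of the difference, with `as_text` and `as_number` as illustrative names:

```python
labels = [
    "benzene - 30.0 - 15",
    "xylene - 5.0 - 10",
    "benzene - 8.5 - 29",
    "benzene - 0.5 - 11",
]

# Second field compared as text: "30.0" sorts before "8.5"
# because the character "3" comes before "8".
as_text = sorted(labels, key=lambda x: x.split(" - ")[:2])

# Second field converted to float: 8.5 sorts before 30.0.
as_number = sorted(
    labels,
    key=lambda x: (x.split(" - ")[0], float(x.split(" - ")[1])),
)

print(as_text)
print(as_number)
```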
