I have a fairly simple list (a number followed by a sentence), here in the right order:

-347 a negative number
-100 another negative number
-25 and again, a negative number
17 some text
25 foo bar
100 two same texts
100 two same texts (almost)
350 a positive number

I need to sort that list each time a new item is added.

I searched S.O. and found this answer:

Sorting in python - how to sort a list containing alphanumeric values?

The code I used is freakish’s, which goes:

import re
def convert(str):
    return int("".join(re.findall("\d*", str)))

list1.sort(key=convert)

In order to better explain my problem, I shuffled the list and ran the code.

The result was:

17 some text
-25 and again, a negative number
25 foo bar
100 two same texts (almost)
100 two same texts
-100 another negative number
-347 a negative number
350 a positive number

What went wrong? What change to the code would make it sort the negative numbers naturally?

ThG
  • `'-'` isn't a digit, so it isn't matched by `'\d'`. Consequently, your pattern match only finds full numbers. Side notes: If your input string is e.g. `'1 this is line 00001'` it will be sorted last by your key. And `''` matches `'\d*'`, but `int('')` raises `ValueError`. – dhke Feb 24 '17 at 21:08
  • You could separate the negative and positive before running, and glue them afterwards, making the negative ones reversed and at the top. – bosnjak Feb 24 '17 at 21:08
  • You could change your convert function to `return int(str.split()[0])` – fafl Feb 24 '17 at 21:10
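The point raised in the comments can be checked directly. The sketch below reuses the key function from the question (with the parameter renamed to `text` and a raw string for the pattern, both my own changes) and prints what it returns for a few lines. The sample strings come from the question and from dhke's comment.

```python
import re

def convert(text):
    # Same logic as the key function in the question: keep only the digits.
    # '-' is not matched by \d, so the sign is silently dropped.
    return int("".join(re.findall(r"\d*", text)))

# A negative and a positive line end up with the same key.
print(convert("-25 and again, a negative number"))
print(convert("25 foo bar"))

# Digits found later in the line are glued onto the leading number.
print(convert("1 this is line 00001"))
```

Because the key for `-25 ...` and `25 ...` is identical, the sort treats them as equal and keeps them in whatever order they had before sorting.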

2 Answers

The sort key can return a tuple, so sort by the number first, then by the text after it:

list_text = """-347 a negative number
-100 another negative number
-25 and again, a negative number
17 some text
25 foo bar
100 two same texts
100 two same texts (almost)
350 a positive number"""

list1 = list_text.split("\n")

list1.sort(key=lambda x: (int(x.split()[0]), x.split(" ",1)))

print("\n".join(list1))
Keatinge
  • To avoid first join you can pass 1 as argument to split - it will limit number of splits – mshutov Feb 24 '17 at 21:25
  • @Keatinge first of all, thanks ; but I get an error on the [list1 = list_text.split("\n")] line : AttributeError: 'list' object has no attribute 'split'. I do not understand why (my Python skills are not worth mentioning) – ThG Feb 24 '17 at 21:30
  • @ThG Just use the last two lines, the others are for a runnable example so everyone can run it and test it – Keatinge Feb 24 '17 at 21:31
  • @Keatinge : It works. And now I am embarrassed : I have 2 good answers and I already "ticked" Mureinik's. And I do not know how to share my compliments on S.O. Thank you very much anyway. – ThG Feb 24 '17 at 22:00

The easiest approach, IMHO, would be to split the string into individual words, and then sort by the first word converted to an int:

list1.sort(key = lambda x : int(x.split(" ")[0]))

EDIT:
As Keatinge pointed out, the above solution won't properly handle two strings that start with the same number. Using the entire string as a secondary sort key should solve this issue:

list1.sort(key = lambda x : (int(x.split(" ")[0]), x))
Mureinik
  • This won't handle two strings that start with the same number but contain different text afterward – Keatinge Feb 24 '17 at 21:12
  • @Keatinge good point! Using the string itself as a secondary search term should do the trick though - see my edited answer. – Mureinik Feb 24 '17 at 21:15
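For reference, here is a minimal runnable sketch that combines the two answers: the leading number is the primary key and the remaining text is the tie-breaker. The helper name `sort_key` and the variable `lines` are my own, not from either answer, and the sketch assumes every line starts with an integer followed by a space.

```python
def sort_key(line):
    # Hypothetical helper: split once at the first space, so the leading
    # number (including any '-' sign) is parsed by int() as a whole.
    number, _, rest = line.partition(" ")
    return int(number), rest

# The list from the question, shuffled as in the failing run.
lines = [
    "17 some text",
    "-25 and again, a negative number",
    "25 foo bar",
    "100 two same texts (almost)",
    "100 two same texts",
    "-100 another negative number",
    "-347 a negative number",
    "350 a positive number",
]

lines.sort(key=sort_key)
print("\n".join(lines))
```

`int()` accepts a leading minus sign, so no regular expression is needed here. A line that does not start with an integer would raise `ValueError`, which this sketch does not handle.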