10

I have the following list of strings, and I want to sort it by the number in each element. Plain sorted fails because it compares the strings character by character, so it cannot order values such as 10 and 3 correctly. I imagine I could do it with re, but that is not very interesting. Does anyone have a nice implementation idea? The code targets Python 3.x.

names = [
'Test-1.model',
'Test-4.model',
'Test-6.model',
'Test-8.model',
'Test-10.model',
'Test-20.model'
]
number_sorted = get_number_sorted(names)
print(number_sorted)
'Test-20.model'
'Test-10.model'
'Test-8.model'
'Test-6.model'
'Test-4.model'
'Test-1.model'
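For illustration, here is a minimal sketch of why the plain call goes wrong (the three-element list is a shortened, made-up variant of the one above): strings are compared character by character, so '3' beats '1' before the second digit of '10' is ever looked at.

```python
names = ['Test-1.model', 'Test-3.model', 'Test-10.model']

# No key function: purely lexicographic comparison of the strings.
plain = sorted(names, reverse=True)

# 'Test-3.model' ends up ahead of 'Test-10.model' because '3' > '1'
# at the first character where the two strings differ.
print(plain)
```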
jpp
jef

7 Answers

7

the key is ... the key

sorted(names, key=lambda x: int(x.partition('-')[2].partition('.')[0]))

This gets the numeric part of the string recognized as the sort key by separating it out and converting it to an int.
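To make the two partition calls easier to follow, here is a small sketch that spells them out step by step. The helper name model_number is made up for illustration, and it assumes every name has the form 'Test-<digits>.model'.

```python
def model_number(name):
    # 'Test-10.model'.partition('-') -> ('Test', '-', '10.model')
    after_dash = name.partition('-')[2]
    # '10.model'.partition('.') -> ('10', '.', 'model')
    digits = after_dash.partition('.')[0]
    return int(digits)

names = ['Test-10.model', 'Test-4.model', 'Test-20.model', 'Test-1.model']
ascending = sorted(names, key=model_number)
descending = sorted(names, key=model_number, reverse=True)
```

Passing reverse=True gives the descending order shown in the question.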

Back2Basics
5

Some alternatives:

(1) Slicing by position:

sorted(names, key=lambda x: int(x[5:-6]))

(2) Stripping substrings:

sorted(names, key=lambda x: int(x.replace('Test-', '').replace('.model', '')))

Or better (Python version ≥ 3.9):

x.removeprefix('Test-').removesuffix('.model')

(3) Splitting characters (also possible via str.partition):

sorted(names, key=lambda x: int(x.split('-')[1].split('.')[0]))

(4) Map with np.argsort on any of (1)-(3):

import numpy as np
list(map(names.__getitem__, np.argsort([int(x[5:-6]) for x in names])))
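As a rough cross-check, the sketch below wraps options (1) to (3) and the removeprefix/removesuffix variant in named functions and confirms they produce the same order. The function names are made up for illustration, and the 3.9+ variant is only used when str.removeprefix is available.

```python
names = ['Test-10.model', 'Test-1.model', 'Test-8.model', 'Test-20.model']

def by_slice(x):
    # Assumes exactly 'Test-' (5 chars) in front and '.model' (6 chars) at the end.
    return int(x[5:-6])

def by_replace(x):
    # Note: replace() removes the substrings wherever they occur, not only at the ends.
    return int(x.replace('Test-', '').replace('.model', ''))

def by_split(x):
    return int(x.split('-')[1].split('.')[0])

def by_affixes(x):
    # str.removeprefix / str.removesuffix exist only on Python 3.9+.
    return int(x.removeprefix('Test-').removesuffix('.model'))

keys = [by_slice, by_replace, by_split]
if hasattr(str, 'removeprefix'):
    keys.append(by_affixes)

results = [sorted(names, key=k) for k in keys]
```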
jpp
  • Given the goal is to sort the original strings, not just get the sorted numbers, using a `key` function to perform the transform would make more sense (and avoid an unnecessary genexpr), e.g. for your first example, `sorted(names, key=lambda x: int(x[5:-6]))`, or for your second `sorted(names, key=lambda x: int(x.replace('Test-', '').replace('.model', '')))` – ShadowRanger Jan 24 '18 at 02:08
  • @ShadowRanger, yep I realise this now. I have edited my answer. – jpp Jan 24 '18 at 02:12
  • I like the multiple options now. That is innovative. – Back2Basics Jan 24 '18 at 02:13
  • Since 3.9, `x.removeprefix('Test-').removesuffix('.model')` might be more appropriate than the `.replace` version. [Doc for str.removeprefix and str.removesuffix](https://docs.python.org/3/library/stdtypes.html#str.removeprefix) – Stef Feb 03 '23 at 15:48
3

I found a similar question myself, along with a solution: Nonalphanumeric list order from os.listdir() in Python

import re
def sorted_alphanumeric(data):
    convert = lambda text: int(text) if text.isdigit() else text.lower()
    alphanum_key = lambda key: [convert(c) for c in re.split(r'([0-9]+)', key)]
    return sorted(data, key=alphanum_key, reverse=True)
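To show what the key looks like, here is a small sketch of the same idea with the key pulled out into a named function. The file names are made up for illustration, and the sort is ascending here rather than reversed.

```python
import re

def alphanum_key(text):
    # The capturing group makes re.split keep the digit runs in the result.
    parts = re.split(r'([0-9]+)', text)
    # Digit runs become ints, everything else is lower-cased text.
    return [int(p) if p.isdigit() else p.lower() for p in parts]

key = alphanum_key('Test-10.model')

files = ['img12.png', 'img2.png', 'IMG1.png', 'img10.png']
ordered = sorted(files, key=alphanum_key)
```

Because the key is a list of text and number chunks, this works for names that do not share a fixed prefix and suffix.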
jef
2

You can use re.findall in the key of the sort function:

import re
names = [
 'Test-1.model',
 'Test-4.model',
 'Test-6.model',
 'Test-8.model',
 'Test-10.model',
 'Test-20.model'
]
final_data = sorted(names, key=lambda x: int(re.findall(r'(?<=Test-)\d+', x)[0]), reverse=True)

Output:

['Test-20.model', 'Test-10.model', 'Test-8.model', 'Test-6.model', 'Test-4.model', 'Test-1.model']
Ajax1234
1

Here is a regex based approach. We can extract the test number from the string, cast to int, and then sort by that.

import re

def grp(txt): 
    s = re.search(r'Test-(\d+)\.model', txt, re.IGNORECASE)
    if s:
        return int(s.group(1))
    else:
        return float('-inf')  # Sorts non-matching strings ahead of matching strings

names.sort(key=grp)
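A short sketch of how the fallback behaves, using a made-up list that contains one non-matching name. It uses sorted, which returns a new list, whereas list.sort sorts in place and returns None.

```python
import re

def grp(txt):
    s = re.search(r'Test-(\d+)\.model', txt, re.IGNORECASE)
    # Non-matching strings get -inf, so they sort ahead of every number.
    return int(s.group(1)) if s else float('-inf')

# 'README.txt' is a hypothetical entry that does not match the pattern.
mixed = ['Test-10.model', 'README.txt', 'test-3.model', 'Test-1.model']
ordered = sorted(mixed, key=grp)
```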
Tim Biegeleisen
  • This still sorts string style (lexicographically), not numerically. You'd want to return `int(s.group(1))` in the first case, and some filler numerical value (e.g. `float('-inf')`, to put strings not matching the pattern at the front of the resulting `list`), not `str`, in the `else` case. – ShadowRanger Jan 24 '18 at 02:10
  • @ShadowRanger No, even [making those changes](http://rextester.com/TKKP68517) still doesn't fix it. I don't know Python, by the way. Feel free to edit this. – Tim Biegeleisen Jan 24 '18 at 02:11
  • @TimBiegeleisen: `list.sort` runs in place and returns `None` (which means "has no return value"). Your test code reassigns `names` to `None` by assigning the result of `names.sort`, which is why it breaks. I removed the `names = ` from `names = names.sort(key=lambda l: grp(l))` (and simplified to `names.sort(key=grp)`; no `lambda` wrapper needed since `grp` already has the correct prototype) and [it works fine](http://rextester.com/IQNKL87746). – ShadowRanger Jan 24 '18 at 02:15
1
def find_between( s, first, last ):
    try:
        start = s.index( first ) + len( first )
        end = s.index( last, start )
        return s[start:end]
    except ValueError:
        return ""

and then do something like

 sorted(names, key=lambda x: int(find_between(x, 'Test-', '.model')))
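One caveat: find_between returns "" when a marker is missing, and int("") raises ValueError. A possible way around that, sketched below, is a wrapper key; number_key and the -1 fallback are assumptions for illustration, not part of the original answer.

```python
def find_between(s, first, last):
    try:
        start = s.index(first) + len(first)
        end = s.index(last, start)
        return s[start:end]
    except ValueError:
        return ""

def number_key(name):
    # Hypothetical wrapper: names without a number sort first instead of raising.
    digits = find_between(name, 'Test-', '.model')
    return int(digits) if digits.isdigit() else -1

# 'notes.txt' is a made-up entry without the expected markers.
names = ['Test-10.model', 'notes.txt', 'Test-2.model']
ordered = sorted(names, key=number_key)
```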
Claudiordgz
1

You can use the key parameter along with sorted() to accomplish this, assuming each string is formatted the same way:

def get_number_sorted(somelist):
    return sorted(somelist, key=lambda x: int(x.split('.')[0].split('-')[1]))

It looks like you might want your list reverse sorted (?), in which case you can add reverse=True as such:

def get_number_sorted(somelist):
    return sorted(somelist, key=lambda x: int(x.split('.')[0].split('-')[1]), reverse=True)
number_sorted = get_number_sorted(names)
print(number_sorted)
['Test-20.model', 'Test-10.model', 'Test-8.model', 'Test-6.model', 'Test-4.model', 'Test-1.model']

See related: Key Functions

x1084