1

I am trying to sort a Python list using the built-in sorted function, as in the code below. However, the list does not come out in the order I expect.

#sort using the number part of the string
mylist = ['XYZ-78.txt', 'XYZ-8.txt', 'XYZ-18.txt'] 
def func(elem):
    return elem.split('-')[1].split('.')[0]

sortlist = sorted(mylist,key=func)
for i in sortlist:
  print(i)

The output is:
XYZ-18.txt
XYZ-78.txt
XYZ-8.txt

I was expecting this output:
XYZ-8.txt
XYZ-18.txt
XYZ-78.txt
Amit Rastogi
  • Properly and how you wish it to be are 2 different things – Rolf of Saxony May 31 '18 at 07:43
  • Possible duplicate of [Does Python have a built in function for string natural sort?](https://stackoverflow.com/questions/4836710/does-python-have-a-built-in-function-for-string-natural-sort) – zvone May 31 '18 at 07:45
  • @RolfofSaxony updated my question with expected output. – Amit Rastogi May 31 '18 at 07:47
  • Try `sorted(d, key=lambda x: int(x.split('-')[1].split('.')[0]))` – Sohaib Farooqi May 31 '18 at 07:47
  • You are expecting `"8"` to sort before `"18"` but those are strings and so sort alphabetically. Suppose they were letters, 1=A etc. You would expect `"AH"` to sort before `"H"`. If you want them to be sorted as numbers you need to convert them to integers as in Rakesh's example. But you need to be sure that the second element of the code can always be converted to an `int`. If not, you have to trap for that. – BoarGules May 31 '18 at 07:47
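To illustrate the last comment, here is a minimal sketch of a key that converts the number part to an `int` and traps names where that conversion fails. The helper name `numeric_part`, the extra entry `'XYZ-abc.txt'`, and the choice to sort unparsable names last are assumptions made for illustration; they are not part of the question.

```python
# Strings compare character by character, so "18" sorts before "8".
print(sorted(["78", "8", "18"]))           # ['18', '78', '8']
print(sorted(["78", "8", "18"], key=int))  # ['8', '18', '78']

def numeric_part(name):
    """Hypothetical helper: key based on the number between '-' and '.'.

    Returns (0, number) for names that parse, and (1, 0) otherwise,
    so names without a usable number sort after all numeric ones.
    """
    try:
        middle = name.split('-')[1].split('.')[0]
        return (0, int(middle))
    except (IndexError, ValueError):
        # No '-' in the name, or the extracted part is not an integer.
        return (1, 0)

# 'XYZ-abc.txt' is a made-up entry to exercise the fallback.
mylist = ['XYZ-78.txt', 'XYZ-8.txt', 'XYZ-abc.txt', 'XYZ-18.txt']
print(sorted(mylist, key=numeric_part))
# ['XYZ-8.txt', 'XYZ-18.txt', 'XYZ-78.txt', 'XYZ-abc.txt']
```

Returning a tuple keeps every key the same type, which avoids comparing an `int` with a `str` in Python 3.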

5 Answers

3

You should convert the number part to an integer:

#sort using the number part of the string
mylist = ['XYZ-78.txt', 'XYZ-8.txt', 'XYZ-18.txt'] 
def func(elem):
    return int(elem.split('-')[1].split('.')[0])

sortlist = sorted(mylist,key=func)
for i in sortlist:
  print(i)

What you see in your output is string ordering: the keys are compared character by character by their ASCII values, so '18' sorts before '8'.
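A small sketch of why the string keys end up in that order. It reuses `func` from the question; the `keys` variable is only introduced here for illustration.

```python
# Key function from the question: returns the number part as a *string*.
def func(elem):
    return elem.split('-')[1].split('.')[0]

keys = [func(name) for name in ['XYZ-78.txt', 'XYZ-8.txt', 'XYZ-18.txt']]
print(keys)  # ['78', '8', '18']

# Strings are compared one character at a time by code point.
print(ord('1'), ord('7'), ord('8'))  # 49 55 56
print('18' < '78' < '8')             # True: the first characters decide it

# Converted to int, the same values compare numerically.
print(sorted(keys, key=int))         # ['8', '18', '78']
```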

Gsk
2

Wrap the extracted value in int().

Example:

mylist = ['XYZ-78.txt', 'XYZ-8.txt', 'XYZ-18.txt'] 
print(sorted(mylist, key=lambda x: int(x.split("-")[-1].split(".")[0])))

Output:

['XYZ-8.txt', 'XYZ-18.txt', 'XYZ-78.txt']
Rakesh
1

With str methods:

mylist = ['XYZ-78.txt', 'XYZ-8.txt', 'XYZ-18.txt']
result = sorted(mylist, key=lambda x: int(x[x.index('-')+1:].replace('.txt', '')))

print(result)

The output:

['XYZ-8.txt', 'XYZ-18.txt', 'XYZ-78.txt']
RomanPerekhrest
1

Use this code to sort the list of strings numerically (which is what is needed) instead of lexicographically (which is what happens in the given code).

#sort using the number part of the string
mylist = ['XYZ-78.txt', 'XYZ-8.txt', 'XYZ-18.txt'] 
def func(elem):
    # take the text between '-' and the 4-character '.txt' suffix, as an int
    return int(elem[elem.index('-')+1:len(elem)-4])
sortlist = sorted(mylist,key=func)
for i in sortlist: 
    print(i) 
cs95
noobron
0

There is a generic approach to this problem called human-readable sort, more commonly known as alphanum sort (or natural sort), which orders strings the way humans expect them to appear.

import re
mylist = ['XYZ-78.txt', 'XYZ-8.txt', 'XYZ-18.txt'] 

def tryint(s):
    # digit chunks become ints; everything else stays a string
    try:
        return int(s)
    except ValueError:
        return s

def alphanum_key(s):
    """ Turn a string into a list of string and number chunks.
        "z23a" -> ["z", 23, "a"]
    """
    return [ tryint(c) for c in re.split('([0-9]+)', s) ]

def sort_nicely(l):
    """ Sort the given list in place in the way that humans expect.
    """
    l.sort(key=alphanum_key)

sort_nicely(mylist)
print(mylist)
# ['XYZ-8.txt', 'XYZ-18.txt', 'XYZ-78.txt']

This works on any string, so you don't have to split and slice characters to extract a sortable field.
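As a rough illustration of that point, here is a sketch of the same idea applied to names with more than one number in them. The function name `natural_key` and the `files` list are made up for this example; the splitting technique is the one from the code above.

```python
import re

def natural_key(s):
    """Split s into alternating text and digit chunks; digit chunks become ints."""
    parts = re.split(r'([0-9]+)', s)
    # With one capturing group, odd positions always hold the digit runs,
    # so every key alternates str, int, str, ... and keys stay comparable.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

print(natural_key('XYZ-18.txt'))  # ['XYZ-', 18, '.txt']

# Hypothetical file names with two numeric fields each.
files = ['v2-part10.txt', 'v10-part1.txt', 'v2-part9.txt']
print(sorted(files, key=natural_key))
# ['v2-part9.txt', 'v2-part10.txt', 'v10-part1.txt']
```

No knowledge of the '-' separator or the '.txt' suffix is needed; each run of digits is compared as a number wherever it appears.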

Good read about alphanum: http://www.davekoelle.com/alphanum.html

Original Source code: https://nedbatchelder.com/blog/200712/human_sorting.html

user1767754