-1

Hi, I am trying to sort my files in numeric order and am not sure whether this is possible. I have the list of files below, whose names start with numbers.

31_test1.txt
1_test1.txt 
3_test4.txt 
2_test2.txt 
10_test3.txt 
20_test4.txt

But when I use the code below, it gives me this output:

10_test3.txt
1_test1.txt
20_test4.txt
2_test2.txt
31_test1.txt
3_test4.txt

My desired output is:

1_test1.txt
2_test2.txt
3_test4.txt
10_test3.txt
20_test4.txt
31_test1.txt

import os

def getfilename(dirName, fileext):
    # Create a list of the file and subdirectory names in the given directory
    listOfFile = os.listdir(dirName)
    listOfFile.sort()  # plain string sort, which produces the lexical order shown above
    allFiles = list()
    # Iterate over all the entries
    for entry in listOfFile:
        # Create the full path
        fullPath = os.path.join(dirName, entry)
        # If the entry is a directory, get the list of files in that directory
        if os.path.isdir(fullPath):
            allFiles = allFiles + getfilename(fullPath, fileext)
        else:
            if fileext in fullPath:
                allFiles.append(fullPath)

    return allFiles
fixatd
ronak
  • What is your code for? – Shadowcoder Dec 04 '20 at 16:26
  • You are currently getting the files sorted by lexical order because that's how strings are sorted. You can sort in natural order using the `key` argument of `sort()` if you can figure out how to extract the numbers from the file names, or you can pad zeros to the left of the file names. – Pranav Hosangadi Dec 04 '20 at 16:28
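
To make the comment above concrete, here is a small sketch of why the plain string sort produces the order shown in the question. It uses only built-in functions and the file names from the question; nothing here is specific to the asker's program.

```python
# File names taken from the question.
names = ["31_test1.txt", "1_test1.txt", "3_test4.txt",
         "2_test2.txt", "10_test3.txt", "20_test4.txt"]

# Strings are compared character by character using code points.
# "10_test3.txt" and "1_test1.txt" first differ at index 1,
# where "0" (code point 48) is smaller than "_" (code point 95).
print(ord("0"), ord("_"))
print("10_test3.txt" < "1_test1.txt")

# A plain sort therefore puts "10_..." before "1_...".
lexical = sorted(names)
print(lexical)
```

This is the same order that listOfFile.sort() gives in the question, which is why a key that converts the leading digits to int is needed.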

5 Answers

0

Try this:

def natural_sort(item):
    # Leading number, converted to int so that 10 sorts after 2
    a = int(item.split('_')[0])
    # Text after the first underscore, used to break ties
    # between names that share the same leading number
    b = item.split('_')[1]
    return a, b

# Assumes allFiles holds bare file names such as '1_test1.txt'
allFiles.sort(key=natural_sort)
print(allFiles)
Shadowcoder
  • 1
    Also also: While code-only answers might answer the question, you could significantly improve the quality of your answer by providing context for your code, a reason for why this code works, and some references to documentation for further reading. From [answer]: _"Brevity is acceptable, but fuller explanations are better."_ When you write _"Try this"_, you give the impression that programming is a trial-and-error type job where you throw things at the wall and see what sticks. Your answer would be so much better if you actually explained what was happening and how to fix it. – Pranav Hosangadi Dec 04 '20 at 16:32
  • `files = ['10_test3.txt', '1_test1.txt', '20_test4.txt', '2_test2.txt', '31_test1.txt', '3_test4.txt', '3_test2.txt']` gives the wrong answer. One would expect `3_test2.txt` to be sorted before `3_test4.txt`. – Pranav Hosangadi Dec 04 '20 at 16:34
0

This solution uses the sorted function, which accepts a key argument: a function that is called on each item and whose return value is used for the comparison. To apply multiple sorting rules, the key function can return a tuple. Here, the first element of the tuple is the leading number, extracted with a regular expression and converted to int. The second element is the full name, so names with the same number are sorted lexically.

import re

random_files = ["10_test3.txt", "2_test2.txt", "3_test4.txt", "20_test4.txt", "31_test1.txt", "1_test1.txt"]

# Raw string so that \d reaches the regex engine unchanged
sorted_files = sorted(random_files, key=lambda item: (int(re.search(r"(\d+)_", item).group(1)), item))
print(sorted_files)
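
One thing to watch with a regex-based key: re.search returns None when a name has no digits followed by an underscore, so the lambda would raise AttributeError on such a name. Below is a minimal sketch of a more forgiving variant. The function name leading_number_key and the choice to put unnumbered names last are assumptions for illustration, not part of the original answer.

```python
import re

def leading_number_key(name):
    """Hypothetical key: (no-number flag, leading number, full name)."""
    match = re.match(r"(\d+)_", name)
    if match is None:
        # No numeric prefix: sort after all numbered names, lexically.
        return (1, 0, name)
    return (0, int(match.group(1)), name)

files = ["10_test3.txt", "notes.txt", "3_test4.txt", "1_test1.txt", "3_test2.txt"]
result = sorted(files, key=leading_number_key)
print(result)
```

Tuples are compared element by element, so the flag decides first, then the number, and the full name only breaks ties.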
alexanderdavide
0
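fileNames = ["31_test1.txt", "1_test1.txt", "3_test4.txt", "2_test2.txt", "10_test3.txt", "20_test4.txt"]
# Collect the leading number of every file name
getFileNumber = []
for i in fileNames:
    getFileNumber.append(int(i.split('_')[0]))

# Walk the distinct numbers in ascending order and print the matching names;
# set() keeps a name from being printed twice when two files share a number
for z in sorted(set(getFileNumber)):
    for i in fileNames:
        if z == int(i.split('_')[0]):
            print(i)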

I hope this helps; adjust it to fit your program.

Pranav Hosangadi
0

You are currently getting the files sorted by lexical order because that's how strings are sorted. We can sort in natural order using the key argument of sort() if we can

1. Figure out how to extract the numbers from the file names

To extract the number at the start of the file name, you can split on the underscore, as Shadowcoder's answer suggests. However, this leaves the problem of sorting two files with the same initial number, such as 3_test2.txt and 3_test4.txt. To get around this, we can return a tuple containing the initial number and the rest of the string. That way, if the first number is equal for both strings, they are sorted in lexical order on the rest of the string.

def keyfunc(x):
    s = x.split('_', 1)  # Split once: we only want the initial integer and the rest of the string.
    return (int(s[0]), s[1])

files = ['10_test3.txt', '1_test1.txt', '20_test4.txt', '2_test2.txt', '31_test1.txt', '3_test4.txt', '3_test2.txt']
files_sorted = sorted(files, key=keyfunc)

This gives us the following result, where the 3_test*.txt files have also been sorted correctly:

['1_test1.txt',
 '2_test2.txt',
 '3_test2.txt',
 '3_test4.txt',
 '10_test3.txt',
 '20_test4.txt',
 '31_test1.txt']
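
The function in the question returns full paths rather than bare names, so int() would fail on something like 'data/10'. A minimal sketch of how the same tuple idea could be applied to paths follows. The name path_key and the 'data' directory are made up for illustration, and every file name is assumed to start with an integer followed by an underscore.

```python
import os

def path_key(path):
    """Hypothetical key for full paths: directory, then leading number, then the rest."""
    directory, name = os.path.split(path)
    # partition splits at the first underscore only
    number, _, rest = name.partition('_')
    return (directory, int(number), rest)

names = ['10_test3.txt', '1_test1.txt', '3_test4.txt', '3_test2.txt']
paths = [os.path.join('data', n) for n in names]
paths_sorted = sorted(paths, key=path_key)
sorted_names = [os.path.basename(p) for p in paths_sorted]
print(sorted_names)
```

Keeping the directory as the first tuple element groups files from the same folder together, which is roughly what the recursive function in the question does.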

Alternatively, we could

2. Pad zeros to the left of the file names.

Lexical order matches natural order once the strings are padded to the same length with zeros, provided the part after the leading number has the same length in every name (as it does here). So, we could define that as our key function:

def keyfunc(x):
    return x.rjust(100, '0')

files_sorted = sorted(files, key=keyfunc)

This gives the same result.
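
Padding the whole name can go wrong when the text after the number varies in length. The sketch below contrasts that with padding only the leading number. The name padded_key and the default width of 10 digits are assumptions for illustration, and the file names are invented to show the difference.

```python
def padded_key(name, width=10):
    """Hypothetical key: zero-pad only the leading number to a fixed width."""
    number, sep, rest = name.partition('_')
    return number.zfill(width) + sep + rest

files = ['2_test10.txt', '10_test3.txt', '1_test1.txt']

# Padding the whole name misorders names whose suffixes differ in length:
# '2_test10.txt' and '10_test3.txt' are both 12 characters long.
whole_name = sorted(files, key=lambda x: x.rjust(100, '0'))
# Padding only the number keeps the numeric order.
prefix_only = sorted(files, key=padded_key)
print(whole_name)
print(prefix_only)
```

With the file names from the question both keys agree, since every suffix there has the same length.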

Pranav Hosangadi
0

This one sorts by the leading number (the part before _test):

listOfFile = sorted(listOfFile,key=lambda x : int(x.split('_')[0])) 

Out:[
 '1_test1.txt',
 '2_test2.txt',
 '3_test2.txt',
 '3_test4.txt',
 '10_test3.txt',
 '20_test4.txt',
 '31_test1.txt'
  ]

The next case sorts with two priorities:
first the number before _test, then the number between test and .txt.

listOfFile = sorted(files, key=lambda x: (int(x.split('_')[0]), int(x.split('.txt')[0][x.index('test')+4:])))



Out:[
 '1_test1.txt',
 '2_test2.txt',
 '3_test2.txt',
 '3_test4.txt',
 '10_test3.txt',
 '20_test4.txt',
 '31_test1.txt'
  ]
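
The two-priority key above depends on the names containing 'test' and ending in '.txt'. A more general sketch splits each name into text and digit runs and compares the digit runs as integers. The name natural_key is hypothetical, and this is one common way to do it rather than the method from this answer.

```python
import re

def natural_key(name):
    """Hypothetical key: split into text and digit runs, comparing digit runs as integers."""
    parts = re.split(r'(\d+)', name)
    # With a capturing group, the odd indices of the result hold the digit runs.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

files = ['10_test3.txt', '3_test10.txt', '3_test4.txt', '1_test1.txt', '3_test2.txt']
result = sorted(files, key=natural_key)
print(result)
```

Because text and numbers always alternate in the same positions, two keys never compare a str with an int, so mixed names should not raise a TypeError.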
zankoAN
  • `files = ['10_test3.txt', '1_test1.txt', '20_test4.txt', '2_test2.txt', '31_test1.txt', '3_test4.txt', '3_test2.txt']` gives the wrong answer. One would expect `3_test2.txt` to be sorted before `3_test4.txt` – Pranav Hosangadi Dec 04 '20 at 16:48
  • Yes, I thought it would be sorted by the first number, but you can use this if you want: sorted(files,key=lambda x : (int(x.split('_')[0]), int(x.split('.')[0][-1]))) – zankoAN Dec 04 '20 at 17:15