I am trying to write a script that reads all the filenames in a given directory and writes the names, without file extensions, to a new .txt file. For example, if my /images directory contains img_1.jpg, img_2.jpg, img_3.jpg, img_4.jpg, the script should create a .txt file containing img_1, img_2, img_3, img_4. My script creates the .txt file correctly, but the order in which the names are written differs from the order in which the images appear in my directory.

import os
files_no_ext = [".".join(f.split(".")[:-1]) for f in os.listdir() if os.path.isfile(f)]

with open('trainval.txt', 'w') as f:
    for s in files_no_ext:
        f.write("%s\n" % s)

The image directory shows the images in the following order:

banana-0.jpg
banana-1.jpg
banana-2.jpg
.
.
banana-1000.jpg

But the output in the .txt file is:

banana-0
banana-1
banana-10
banana-100
banana-1000
banana-1001
banana-1002
banana-1003
banana-1004
banana-1005
banana-1006
banana-1007
banana-1008
banana-1009

How can I ensure that the output is in the same order as the order of images in the directory?

Veejay
  • why do they need to be in the same order? – SuperStew Nov 16 '17 at 19:57
  • How do you know that "the image directory has images in the following order"? What tool do you use to display that list? – Robᵩ Nov 16 '17 at 19:59
  • The order the items display in your directory can change depending on how you tell your file explorer to sort them - is there one particular way that you want to sort them in the text file (e.g. alphabetical), or do you want them to *always* show up in the same order as the directory, no matter how you sort them in the directory? – Alex von Brandenfels Nov 16 '17 at 20:03
  • Answered, hope it was helpful! – alexisdevarennes Nov 16 '17 at 20:05
  • I think that it's highly likely that the "order" you are seeing is just what windows is showing you. For instance, if you created the files in that order, and you have it set to view by date, windows will *show* you in that order, but that doesn't mean that they're actually in that order. – Acccumulation Nov 16 '17 at 20:09
  • Related (possibly even duplicate): https://stackoverflow.com/questions/4836710/does-python-have-a-built-in-function-for-string-natural-sort – John Y Nov 16 '17 at 20:10
  • `for i in range(1000)`? – Xantium Nov 16 '17 at 20:14

1 Answer


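os.listdir() does not sort anything: it returns the entries in whatever order the file system hands them back, which the Python documentation describes as arbitrary. The order you see in a file explorer is a sort applied by that tool, not the order stored on disk.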

See:

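Python - order of os.listdir (https://stackoverflow.com/questions/37245921/python-order-of-os-listdir)

Nonalphanumeric list order from os.listdir() in Python (https://stackoverflow.com/questions/4813061/nonalphanumeric-list-order-from-os-listdir-in-python)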
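
The listing in the question happens to match plain lexicographic (character-by-character) string order, in which "banana-10" sorts before "banana-2". A minimal sketch of that effect, using made-up names rather than a real directory:

```python
# Hypothetical file names, used only for illustration.
names = ["banana-2", "banana-10", "banana-0", "banana-100", "banana-1"]

# Plain string comparison works character by character,
# so "banana-10" and "banana-100" land before "banana-2".
lexicographic = sorted(names)
print(lexicographic)
# ['banana-0', 'banana-1', 'banana-10', 'banana-100', 'banana-2']
```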

To sort your list based on the numeric value in the file names, use sorted() with a regex-based key:

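import os
import re

# Strip the extension from every regular file in the current directory.
files_no_ext = [".".join(f.split(".")[:-1]) for f in os.listdir() if os.path.isfile(f)]

# Sort by the non-digit part first (alphabetically), then by the numeric value
# of the digits. Names without digits get 0 so int() never sees an empty string.
files_no_ext = sorted(
    files_no_ext,
    key=lambda x: (re.sub(r'\d', '', x), int(re.sub(r'\D', '', x) or 0)),
)

with open('trainval.txt', 'w') as f:
    for s in files_no_ext:
        f.write("%s\n" % s)
    # Could also do: f.write("\n".join(files_no_ext))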

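If you want to keep whatever is already in trainval.txt, open it with mode="a" (append) instead of mode="w"; keep in mind that re-running the script will then write the names again.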
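
For file names that are more varied than "banana-N", one common alternative is a "natural sort" key that splits each name into text and number chunks. The sketch below is only one way to do it; the helper names natural_key and stems_in_natural_order are made up for this example, and it swaps in os.path.splitext for the split/join approach above.

```python
import os
import re


def natural_key(name):
    """Split a name into text and integer chunks so digit runs compare numerically."""
    # With a capturing group, re.split keeps the digit runs, so the result
    # alternates: text, digits, text, digits, ... (text chunks may be empty).
    parts = re.split(r"(\d+)", name)
    # Odd positions are always the digit runs; convert only those to int.
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def stems_in_natural_order(directory="."):
    """Return the names of regular files in `directory`, without extensions, naturally sorted."""
    stems = [
        os.path.splitext(entry)[0]
        for entry in os.listdir(directory)
        if os.path.isfile(os.path.join(directory, entry))
    ]
    return sorted(stems, key=natural_key)


# Example with made-up names (no directory access needed):
example = sorted(["banana-10", "banana-2", "apple-3", "banana-1"], key=natural_key)
print(example)  # ['apple-3', 'banana-1', 'banana-2', 'banana-10']
```

To produce the file from the question, the result of stems_in_natural_order() can be written out with f.write("\n".join(...)) in the same way as above.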

alexisdevarennes
  • This looks satisfactory if OP is confident that all of his filenames will be similar except for their digits. But it may not work perfectly on more varied inputs. For instance, with the input `files_no_ext = ["apple-2.txt", "banana-1.txt"]`, the sorted result will put banana-1.txt first. – Kevin Nov 16 '17 at 20:09
  • Isn't that what he wants though? To have the order 1, 2, 3, 4, 5, 6...? – alexisdevarennes Nov 16 '17 at 20:10
  • updated, passing a tuple with x and also converting the output of re to int i.e. (str, int) so it also sorts it alphabetically in addition – alexisdevarennes Nov 16 '17 at 20:13
  • This was really helpful. Thanks a lot :) – Veejay Nov 16 '17 at 20:13
  • "Isnt that what he wants though?" I'm interpreting his question as "how do I get file ordering identical to the ordering my OS has?". If he's on Windows like I am, then apple-2.txt comes before banana-1.txt in the file explorer. – Kevin Nov 16 '17 at 20:14
  • Sure, but that would already be answered by the fact that listdir() returns the order of the file system; that's the first thing mentioned. – alexisdevarennes Nov 16 '17 at 20:15
  • Next, the answer covers how he can return the names ordered by the numeric value in the filenames. Even if it's not explicitly stated in the question, it's pretty clear that is what OP wants. – alexisdevarennes Nov 16 '17 at 20:16
  • @Veejay glad I could help ! Welcome to SO! – alexisdevarennes Nov 16 '17 at 20:17
  • "listdir() returns the order of the file system". Strange, that's not what I'm seeing. Explorer shows the contents of my folder as apple-3.txt, banana-1.txt, banana-2.txt, banana-10.txt; while os.listdir returns `['apple-3.txt', 'banana-1.txt', 'banana-10.txt', 'banana-2.txt']`. (Note that all of this is purely academic, since the OP is satisfied with your solution; I just think it's interesting to talk about) – Kevin Nov 16 '17 at 20:18
  • @Kevin Please see https://stackoverflow.com/questions/37245921/python-order-of-os-listdir – alexisdevarennes Nov 16 '17 at 20:19
  • Wish you a nice day! Please continue debate in that question / post ! – alexisdevarennes Nov 16 '17 at 20:20
  • @Kevin If you want to further entertain yourself please read https://stackoverflow.com/questions/4813061/nonalphanumeric-list-order-from-os-listdir-in-python also – alexisdevarennes Nov 16 '17 at 20:20
  • @Kevin Also, Explorer probably does some sorting, and I'm pretty sure it does not show you the order of the fs. I guess a better way would be to check the folder from a command line. – alexisdevarennes Nov 16 '17 at 20:22
  • Hmm, `dir` from the command line does indeed show a different order than file explorer. I still think it would be interesting to replicate explorer's not-quite-lexicographic ordering, but we probably can't hash it out entirely in this comment thread ;-) – Kevin Nov 16 '17 at 20:34
  • Hehe no, I don't think so. I don't know the ordering Explorer uses, but notice I updated my answer so that it also sorts alphabetically and not only on the numeric value. Wish you a nice day Kevin! – alexisdevarennes Nov 16 '17 at 20:36