I am looking to iterate through a set of JPG files in a folder and combine them into a single PDF. Each JPG represents one page of the PDF, so the files must be sorted into page order when I iterate through the folder.

My JPG file structure in this folder is like the following:

filename_1.jpg
filename_2.jpg
filename_3.jpg
filename_4.jpg
filename_5.jpg
filename_6.jpg
filename_7.jpg
filename_8.jpg
filename_9.jpg
filename_10.jpg
filename_11.jpg
filename_12.jpg
filename_13.jpg
filename_14.jpg
filename_15.jpg

Where the number at the end of the filename represents the page number in the PDF.

When I do the following to test whether the files are sorted in the correct order:

for file in sorted(os.listdir(folder_path)):
    print(file)

I get the following output from sorted():

filename_1.jpg
filename_10.jpg
filename_11.jpg
filename_12.jpg
filename_13.jpg
filename_14.jpg
filename_15.jpg
filename_2.jpg
filename_3.jpg
filename_4.jpg
filename_5.jpg
filename_6.jpg
filename_7.jpg
filename_8.jpg
filename_9.jpg

While this is the correct order for an alphanumeric (string) sort, it is not the correct page order, so the pages of the resulting PDF will be out of order. I know that zero-padding the single-digit page numbers would fix this (i.e. filename_01.jpg instead of filename_1.jpg). However, I have over 8,000 JPG files across more than 600 folders, and renaming all of the single-digit files this way is not a straightforward task for me to take on.
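To illustrate the behaviour described above, here is a minimal sketch of why the string sort puts page 10 before page 2. The variable names are only for illustration; the file names are the ones from the listing.

```python
# Strings are compared character by character, so the "1" in "filename_10.jpg"
# is compared with the "2" in "filename_2.jpg" before the "0" is ever considered.
string_order = "filename_10.jpg" < "filename_2.jpg"
print(string_order)  # True: as text, page 10 sorts before page 2

# Compared as integers, the same page numbers order the way I need.
int_order = 10 < 2
print(int_order)  # False: as numbers, page 10 comes after page 2
```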

Does anyone have a suggestion on how I can get these files to sort appropriately based on the page number at the end of the filename?

CSlater
  • This answers your question: [How do you sort files numerically?](https://stackoverflow.com/questions/4623446/how-do-you-sort-files-numerically) – Jongware Feb 06 '20 at 15:58

1 Answer

There may be a more efficient way to do this, but if your file names follow the format in the posted question, <string>_<int>.<ext>, then this would work:

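import os
# Map each page number (the integer after the underscore) to its file name, then read the names back in key order.
files = {int(file.split("_")[1].split(".")[0]): file for file in os.listdir(folder_path)}
sorted_files = [files[file_key] for file_key in sorted(files)]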

Basically, you create a dictionary that maps each integer file number to its file name, sort the keys, and then build the list of file names in that key order.
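As a possible alternative, the same numeric ordering can be had by passing a key function to sorted(), which avoids building the intermediate dictionary. This is only a sketch under the same <string>_<int>.<ext> assumption: page_number is a hypothetical helper name, the regular expression is my own choice rather than anything from the question, and the sample list stands in for os.listdir(folder_path).

```python
import re

def page_number(filename):
    """Return the integer just before the extension, e.g. 'filename_10.jpg' -> 10."""
    match = re.search(r"_(\d+)\.[^.]+$", filename)
    # Assumption: names without a trailing number sort first instead of raising an error.
    return int(match.group(1)) if match else -1

# Sample names standing in for the folder listing.
names = ["filename_1.jpg", "filename_10.jpg", "filename_2.jpg", "filename_11.jpg"]
ordered = sorted(names, key=page_number)
print(ordered)
# ['filename_1.jpg', 'filename_2.jpg', 'filename_10.jpg', 'filename_11.jpg']
```

With a real folder this would be sorted(os.listdir(folder_path), key=page_number). Because the pattern is anchored to the end of the name, it should also cope with prefixes that contain extra underscores.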

Cryptoharf84