
I am looking for a more optimized way to move files into folders. I currently have 600k files and would like to split them into separate folders in 40k-sized chunks. Below is the method I am currently using; however, it looks like it will take a few days to complete. Any help you can provide would be greatly appreciated.

import os, shutil
os.chdir('filepath')
list_of_files = os.listdir()
# split list into 40k-sized chunks
chunks = [list_of_files[x: x + 40000] for x in range(0, len(list_of_files), 40000)]
# make new folders for the files
for x in range(len(chunks)):
    os.mkdir('file path' + str(x))
# move files to folders
for x in range(len(chunks)):
    for i in chunks[x]:
        if i in os.listdir():
            shutil.copy(i, 'file path' + str(x))
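One likely culprit in the loop above, separate from raw disk speed, is the `if i in os.listdir()` check: it rescans the entire 600k-entry directory once per file, which is quadratic. Below is a sketch of the same chunk-and-move logic that lists the directory once and uses `shutil.move` (the stated goal is moving, not copying, and a move on the same filesystem is a cheap rename). The directory names (`tempfile.mkdtemp()` demo source, `'chunk0'`-style folder names) and the small file counts are placeholders for illustration; the real run would point `src` at the question's directory and use `chunk_size = 40_000`.

```python
import os
import shutil
import tempfile

# Demo setup with a handful of files; in the real case, set src to the
# question's 'filepath' directory and chunk_size to 40_000.
src = tempfile.mkdtemp()
for n in range(10):
    with open(os.path.join(src, f'file{n:03d}.txt'), 'w') as f:
        f.write('x')

chunk_size = 4
files = sorted(os.listdir(src))   # list the directory ONCE, up front
chunks = [files[i:i + chunk_size] for i in range(0, len(files), chunk_size)]

for n, chunk in enumerate(chunks):
    dest = os.path.join(src, 'chunk' + str(n))    # hypothetical folder name
    os.makedirs(dest, exist_ok=True)
    for name in chunk:
        # shutil.move is a rename when source and destination share a
        # filesystem, and there is no per-file os.listdir() scan here.
        shutil.move(os.path.join(src, name), os.path.join(dest, name))

# 10 files in chunks of 4 -> folders hold 4, 4, and 2 files
print([len(os.listdir(os.path.join(src, d))) for d in sorted(os.listdir(src))])
```

Note this sketch assumes the destination folders live on the same filesystem as the source; across filesystems, `shutil.move` falls back to a copy-then-delete and the operation becomes genuinely I/O bound, as the comments below suggest.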
    I suspect your script is I/O bound, meaning it's spending most of its time copying files — you should profile it to verify this assumption, see [How can you profile a Python script?](https://stackoverflow.com/questions/582336/how-can-you-profile-a-python-script) If that's the case, there is very little you can do to speed it up other than using different hardware. – martineau Apr 25 '21 at 23:32
  • I would look at the [shutil.copytree](https://docs.python.org/3/library/shutil.html#shutil.copytree) function. It would have a lot of under-the-hood optimisations that would make it faster than copying each file individually. – cactus Apr 26 '21 at 00:28
  • How long does it take? Do you have an SSD or a HDD? The speed of the operation is likely bounded by both the number of IOps of your storage device and the filesystem you use, as well as the OS (due to possible caching). – Jérôme Richard Apr 26 '21 at 08:22
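As martineau's comment suggests, profiling would confirm whether the time really is going to file I/O. A minimal `cProfile` sketch follows; `copy_job` here is a hypothetical stand-in for the question's chunk-and-copy loop, and the top of the report would show whether `shutil.copy` and file operations dominate the cumulative time.

```python
import cProfile
import io
import pstats

def copy_job():
    # hypothetical stand-in for the question's chunk-and-copy loop
    sum(range(1000))

# Run the job under the profiler and report the hottest functions by
# cumulative time; if shutil.copy / OS file calls dominate, the script
# is I/O bound and a faster algorithm won't help much.
profiler = cProfile.Profile()
profiler.enable()
copy_job()
profiler.disable()

out = io.StringIO()
pstats.Stats(profiler, stream=out).sort_stats('cumulative').print_stats(5)
print(out.getvalue())
```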

0 Answers