I have twelve sub-directories and a Python script that has to be run on each of them. Running it on one sub-directory at a time takes a long time, so I want to run the same code on all twelve sub-directories simultaneously.
My computer has two physical CPUs, each with 12 cores.
Each sub-directory contains several files to be processed.
The code I tried is below. It works, but it handles the sub-directories one at a time.
import glob
import numpy as np
from concurrent.futures import ThreadPoolExecutor

wd = "/data0/"
sub_directories = [wd + "jan/", wd + "feb/", wd + "mar/", wd + "apr/",
                   wd + "may/", wd + "jun/", wd + "jul/", wd + "aug/",
                   wd + "sep/", wd + "oct/", wd + "nov/", wd + "dec/"]

def processor(file):
    data = np.fromfile(file)
    # do some calculations
    result = data  # placeholder for the computed array
    # np.save takes the output file name first, then the array
    np.save(file + ".npy", result)

for sub_directory in sub_directories:
    files = glob.glob(sub_directory + "*.txt")
    with ThreadPoolExecutor(max_workers=5) as tpe:
        # list() consumes the results so that exceptions in processor are raised
        list(tpe.map(processor, files))
How can I run this code on all twelve sub-directories simultaneously?
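To make the goal concrete, here is a minimal sketch of one possible approach, not a definitive solution: submit one task per sub-directory to a `ProcessPoolExecutor` and keep the thread pool inside each task. It assumes the files are independent of each other. The names `process_file`, `process_directory`, `run_all` and the parameter `max_processes` are hypothetical, and `process_file` is only a stand-in (it returns the file size) for the real `np.fromfile` / calculation / `np.save` step.

```python
import glob
import os
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def process_file(path):
    # Hypothetical stand-in for the real per-file work
    # (np.fromfile, the calculations, np.save); it only reports the file size.
    return os.path.getsize(path)


def process_directory(sub_directory):
    # Runs inside one worker process: handle every .txt file of one sub-directory.
    files = sorted(glob.glob(os.path.join(sub_directory, "*.txt")))
    with ThreadPoolExecutor(max_workers=5) as tpe:
        # list() forces evaluation, so an exception in process_file is re-raised here.
        results = list(tpe.map(process_file, files))
    return sub_directory, len(results)


def run_all(sub_directories, max_processes=12):
    # One task per sub-directory, each executed in a separate process.
    with ProcessPoolExecutor(max_workers=max_processes) as ppe:
        return list(ppe.map(process_directory, sub_directories))


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block
    # on platforms that start workers by re-importing the main module.
    wd = "/data0/"
    months = ["jan", "feb", "mar", "apr", "may", "jun",
              "jul", "aug", "sep", "oct", "nov", "dec"]
    for sub_directory, n_files in run_all([wd + m + "/" for m in months]):
        print(sub_directory, n_files)
```

Whether the inner thread pool helps probably depends on how much of the per-file work is I/O. If the calculations dominate, a single `ProcessPoolExecutor` mapped over all files from all sub-directories might spread the load across the 24 cores more evenly than one process per sub-directory.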