so I have 2 directories with 2 different file types (eg .csv, .png) but with the same basename (eg 1001_12_15.csv, 1001_12_15.png). I have many thousands of files in each directory.
What I want to do is to get the full paths of files, after having matched the basenames and then DO something with th efull path of both files.
I am asking some help of how to speed up the procedure.
My approach is:
csvList=[a list with the full path of each .csv file]
pngList=[a list with the full path of each .png file]
for i in range(0,len(csvlist)):
csv_base = os.path.basename(csvList[i])
#eg 1001
csv_id = os.path.splitext(fits_base)[0].split("_")[0]
for j in range(0, len(pngList)):
png_base = os.path.basename(pngList[j])
png_id = os.path.splitext(png_base)[0].split("_")[0]
if float(png_id) == float(csv_id):
DO SOMETHING
more over I tried fnmatch something like:
for csv_file in csvList:
try:
csv_base = os.path.basename(csv_file)
csv_id = os.path.splitext(csv_base)[0].split("_")[0]
rel_path = "/path/to/file"
pattern = "*" + csv_id + "*.png"
reg_match = fnmatch.filter(pngList, pattern)
reg_match=" ".join(str(x) for x in reg_match)
if reg_match:
DO something
It seems that using the nested for loops is faster. But I want it to be even faster. Are there any other approaches that I could speed up my code?