I am trying to get the full paths of a group of files whose names I have in a list. The files are in different subfolders. I am using os.walk and nested loops to go through the files, appending each complete path to a new list for use in a different program. However, there is an error in the code: only the first cycle of the outer loop seems to produce a result.
The code is based on this thread: Need the path for particular files using os.walk()
I am using Python 3.6 on macOS 10.14.6.
I am not sure if it matters, but the directories are on an external hard drive.
import pandas as pd
import os

dir = "/Volumes/dir1/dir2"
fastafiles = ["file1", "file2", "file3"]
fastafiles_df = pd.DataFrame(fastafiles)
fasta_paths = []
for fasta in fastafiles_df[0]:
    #1
    for dir, subdirs, files in os.walk(dir):
        for file in files:
            if file.endswith(fasta):
                #2
                fasta_paths.append(os.path.join(dir, file))
#3
Running the code gives me 1 entry in fasta_paths, containing only the path of the first file.
If I print(fasta) at #1, I get all 3 file names from my dataframe.
If I print(file) at #2, I get only 1 file name.
If I print(fasta_paths) at #3, I get only the path of the first file.
Could someone point out why the loop does not continue?
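A likely cause, as far as the posted code shows: the inner loop `for dir, subdirs, files in os.walk(dir)` reuses the name `dir` for both the root folder and the folder currently being walked. When the first walk finishes, `dir` holds the last folder that was visited, so the second and third iterations start os.walk from that folder instead of from "/Volumes/dir1/dir2" and find nothing. The external drive is probably not involved. Below is a minimal sketch of one way around this, using distinct names and a single walk for all file names. The helper `find_paths`, the names `root_dir` and `current_dir`, and the temporary folder layout are hypothetical and only there to make the example self-contained.

```python
import os
import tempfile


def find_paths(root_dir, suffixes):
    """Return full paths of files under root_dir whose names end with any suffix."""
    suffixes = tuple(suffixes)  # str.endswith accepts a tuple of suffixes
    matches = []
    # current_dir is deliberately a different name from root_dir,
    # so the starting folder is never overwritten by the loop.
    for current_dir, _subdirs, files in os.walk(root_dir):
        for name in files:
            if name.endswith(suffixes):
                matches.append(os.path.join(current_dir, name))
    return matches


# Hypothetical layout: three empty files spread over nested subfolders.
with tempfile.TemporaryDirectory() as root:
    layout = [("a", "file1"), ("b", "file2"), (os.path.join("b", "c"), "file3")]
    for sub, name in layout:
        os.makedirs(os.path.join(root, sub), exist_ok=True)
        open(os.path.join(root, sub, name), "w").close()

    found = find_paths(root, ["file1", "file2", "file3"])
    found_names = sorted(os.path.basename(p) for p in found)
    print(found_names)
```

Walking the tree once and checking every name against all suffixes also avoids re-reading the whole folder tree for each file, which may matter on a slow external drive. Keeping the original nested loops and only renaming the loop variable (for example `for current_dir, subdirs, files in os.walk(dir)`) should work as well.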