I'm running a routine that opens a directory and all its sub-directories, performs some tasks, then writes the results to a .csv using pandas. However, I also need to capture each sub-directory's name so it can be written to the .csv too.
For a single sub-directory, I can do this with:
import os
path = r'/users/directory/sub-directory'
dataframe['sub-directory'] = os.path.basename(path)
print(dataframe)
A B C sub-directory
1 2 3 Folder-1
4 5 6 Folder-1
7 8 9 Folder-1
Here the sub-directory is easily assigned with os.path.basename(path). However, I want to run through the whole directory, which works using glob, but I lose the sub-directory names when outputting to a .csv:
path = r'/users/directory/*/'  # using glob
dataframe['sub-directory'] = os.path.basename(path)
print(dataframe)
#Actual Output
A B C sub-directory
1 2 3 NaN
4 5 6 NaN
7 8 9 NaN
1 2 3 NaN
4 5 6 NaN
7 8 9 NaN
#Desired Output
A B C sub-directory
1 2 3 Folder-1
4 5 6 Folder-1
7 8 9 Folder-1
1 2 3 Folder-2
4 5 6 Folder-3
7 8 9 Folder-4
I've seen the answer to "Getting a list of all subdirectories in the current directory", but I'm not sure how to integrate it into my routine.
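For reference, here is a minimal sketch of the shape I think the solution might take. It rests on a few assumptions. The per-folder tasks are not shown above, so `process` is a hypothetical callable that takes a folder path and returns a DataFrame, and `subdirectory_names` and `load_all` are names made up for illustration. One thing worth noting: os.path.basename returns an empty string for any path that ends in a separator, and the glob pattern string is not itself one of the matched folders, so the name has to be taken from each matched path after normalising it.

```python
import glob
import os


def subdirectory_names(root):
    """Return (path, name) pairs for each immediate sub-directory of root."""
    pairs = []
    # A pattern ending in a separator makes glob match directories only.
    for path in sorted(glob.glob(os.path.join(root, "*", ""))):
        # Matched paths keep the trailing separator, so normalise first;
        # otherwise os.path.basename would return ''.
        name = os.path.basename(os.path.normpath(path))
        pairs.append((path, name))
    return pairs


def load_all(root, process):
    """Combine per-folder DataFrames, tagging each row with its folder name.

    `process` is a hypothetical stand-in for the per-folder tasks: it takes
    a folder path and returns a pandas DataFrame.
    """
    import pandas as pd  # imported here so the helper above needs only the stdlib

    frames = []
    for path, name in subdirectory_names(root):
        frame = process(path)
        frame["sub-directory"] = name  # a scalar is broadcast to every row
        frames.append(frame)
    # Note: pd.concat raises ValueError if no sub-directories were found.
    return pd.concat(frames, ignore_index=True)
```

With something like this, `load_all(r'/users/directory', process).to_csv(...)` would write one combined .csv in which each row carries the name of the folder it came from, rather than a value derived from the glob pattern.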