I'm new to Python and having some trouble looping over all the files in my directory.

I am trying to import data from every Excel file in every subfolder of a single directory. For example, I have a directory named "data" which has five different subfolders, and each subfolder contains Excel files from which I want to extract data.

I guess my current code is not working because it only loops over the entries directly inside the directory and does not descend into the subfolders. How do I modify my current code to extract data from all the subfolders in my directory?

import os
import pandas as pd

data_location = "data/"
df_total = pd.DataFrame()  # must exist before the first pd.concat below

for file in os.listdir(data_location):
    df_file = pd.read_excel(data_location + file)

    df_file.set_index(df_file.columns[0], inplace=True)
    selected_columns = df_file.loc["Country":"Score", :'Unnamed: 1']
    selected_columns = selected_columns.dropna()  # avoids modifying a slice in place

    df_total = pd.concat([selected_columns, df_total], ignore_index=True)

Also, I've been trying to create a new variable from each file name as I import the files. For example, if there are 5 files (file1 to file5) in a directory, I want to create a new variable called "Source" whose values would be file1, file2, file3, file4, and file5. I want Python to append this value for the new variable as it imports each file in the loop. Could anyone please help me with this?

  • Does this answer your question? [Iterating through directories with Python](https://stackoverflow.com/questions/19587118/iterating-through-directories-with-python) – Tomerikoo Oct 27 '20 at 15:13

2 Answers

To go through subdirectories recursively, try something like this:

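import os
import pandas as pd

data_location = 'C:/path/to/data'
for subdir, dirs, files in os.walk(data_location):
    for file in files:
        # os.walk yields bare file names; join with subdir to get the full path
        df_file = pd.read_excel(os.path.join(subdir, file))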
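
The second part of the question, a "Source" column holding each file name, is not covered by the loop above. A minimal sketch of one way to do it, assuming the file name without its extension is the value wanted; `load_all` and its `reader` parameter are hypothetical names introduced here, and the asker's row and column selection is left out for brevity:

```python
import os
import pandas as pd

EXCEL_EXTENSIONS = (".xlsx", ".xls", ".xlsm")

def load_all(data_location, reader=pd.read_excel):
    """Read every Excel file under data_location and tag rows with their file name.

    `reader` is a hypothetical hook (defaults to pd.read_excel) so the function
    can be tried out without real workbooks.
    """
    frames = []
    for subdir, dirs, files in os.walk(data_location):
        for file in files:
            if not file.lower().endswith(EXCEL_EXTENSIONS):
                continue  # ignore anything that is not an Excel workbook
            df = reader(os.path.join(subdir, file))
            # "file1.xlsx" -> "file1"; every row from this file gets the same value
            df["Source"] = os.path.splitext(file)[0]
            frames.append(df)
    # pd.concat raises on an empty list, so return an empty frame in that case
    return pd.concat(frames, ignore_index=True) if frames else pd.DataFrame()
```

The `set_index` / `.loc` / `dropna` steps from the question could be applied to `df` just before the "Source" column is assigned.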
user3150635
-1
import os

def excels(data_location):
    """Return the paths of all Excel files under data_location, including subfolders."""
    excel_files = []  # separate name so os.walk's `files` does not overwrite the result list
    for subdir, dirs, files in os.walk(data_location):
        for file in files:
            if file.endswith(('.xlsx', '.xls', '.xlsm')):
                excel_files.append(os.path.join(subdir, file))
    return excel_files

# Example usage
data_location = '/path/to/directory'
excels(data_location)
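
The same search could also be written with `pathlib`, which handles the path joining itself. This is a sketch rather than a drop-in replacement: `find_excel_files` and `EXCEL_SUFFIXES` are names made up for the example, and skipping files that start with `~$` (the lock files Excel leaves next to an open workbook) is an added assumption about what is wanted.

```python
from pathlib import Path

EXCEL_SUFFIXES = {".xlsx", ".xls", ".xlsm"}

def find_excel_files(root):
    """Return the sorted paths of Excel files under root, including subfolders."""
    return sorted(
        p for p in Path(root).rglob("*")      # rglob descends into every subfolder
        if p.is_file()
        and p.suffix.lower() in EXCEL_SUFFIXES
        and not p.name.startswith("~$")       # assumption: skip Excel lock files
    )
```

Each returned `Path` can be passed straight to `pd.read_excel`, and `p.stem` gives the file name without its extension.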
Iyad Bacdounes
  • Welcome back to Stack Overflow. It looks like it's been a while since you've posted and may not be aware of the latest policies since your answer appears likely to have been entirely or partially written by AI (e.g., ChatGPT). As a heads-up, [posting of AI-generated content is not permitted on Stack Overflow](//meta.stackoverflow.com/q/421831). If you used an AI tool to assist with any answer, I would encourage you to delete it. Thanks! – NotTheDr01ds Jul 05 '23 at 11:54
  • **Readers should review this answer carefully and critically, as AI-generated information often contains fundamental errors and misinformation.** If you observe quality issues and/or have reason to believe that this answer was generated by AI, please leave feedback accordingly. The moderation team can use your help to identify quality issues. – NotTheDr01ds Jul 05 '23 at 11:54
  • Sorry, I wrote the code and tested it in a live project, then asked AI to explain it. I have just removed the generated explanation. – Iyad Bacdounes Jul 06 '23 at 12:27
  • Thanks for the reply and edit! I do think folks would welcome *your* explanation, though ;-). – NotTheDr01ds Jul 06 '23 at 19:33
  • Thank you for the comment. It's just discouraging to get -1. – Iyad Bacdounes Jul 09 '23 at 08:22
  • Not my downvote, but a lot of people do downvote code-only answers (as well as those generated by AI or for many other reasons). – NotTheDr01ds Jul 09 '23 at 11:34
  • Thank you so much for your time and feedback – Iyad Bacdounes Jul 09 '23 at 14:36