I have some Python pandas code that reads multiple files from the same folder. The code works fine and pulls in all the data I need, but I also need to include each file's name as a separate column. I'm having a bit of a brain freeze here. Can anyone help?

import pandas as pd
import glob

path = r'C:\my_file_path\Misc'
allFiles = glob.glob(path + "/*.csv")

# Read every CSV as strings, then stack them into a single frame.
list_ = []
for file_ in allFiles:
    df = pd.read_csv(file_, index_col=None, dtype=str, header=0)
    list_.append(df)
frame = pd.concat(list_, axis=0, ignore_index=True)
  • Modify your read_csv line: `df = pd.read_csv(...).assign(filename=file_)` – cs95 Jan 09 '19 at 15:32
  • Seems a waste to add a whole column just to hold a reference to a file name. Why not use a dict where the filename is the key and the values are the dfs? Or add it as an attribute, e.g. `df.filename = file_`; however, attributes won't be copied, and won't persist if you write to CSV. – EdChum Jan 09 '19 at 15:34
  • Thanks for the prompt response. This worked perfectly. – Bryan_UK Jan 09 '19 at 15:47
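
For reference, the two suggestions in the comments can be sketched roughly as below. This is a minimal sketch, not a definitive solution: the helper names `read_with_filename` and `read_as_dict` are made up for illustration, and storing only the base name via `os.path.basename` (rather than the full path in `file_`, as the comment does) is an assumption about what the column should hold.

```python
import glob
import os

import pandas as pd


def read_with_filename(pattern):
    """Read every CSV matching `pattern` and tag each row with its source file.

    Hypothetical helper: follows the `.assign(filename=...)` suggestion.
    """
    frames = []
    for file_ in sorted(glob.glob(pattern)):
        df = pd.read_csv(file_, index_col=None, dtype=str, header=0)
        # assign() returns a copy with a constant `filename` column added.
        # Assumption: only the base name is wanted, not the full path.
        frames.append(df.assign(filename=os.path.basename(file_)))
    # Stack all files and renumber the rows 0..n-1.
    return pd.concat(frames, axis=0, ignore_index=True)


def read_as_dict(pattern):
    """Alternative: keep one DataFrame per file, keyed by file name.

    Hypothetical helper: follows the dict suggestion, no extra column.
    """
    return {
        os.path.basename(file_): pd.read_csv(
            file_, index_col=None, dtype=str, header=0
        )
        for file_ in sorted(glob.glob(pattern))
    }
```

With the question's variables, `frame = read_with_filename(path + "/*.csv")` should give the same rows as the original loop plus a `filename` column. Note that `pd.concat` raises a `ValueError` when the pattern matches no files, so an empty folder needs separate handling.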
