I have multiple (25k) .csv files that I'm trying to append into a HDFStore file. They all share identical headers. I am using the below code, but for some reason whenever I run it the dataframe isn't appended with all of the files, but rather is only the last file in the list.
filenames = [] #list of .csv file paths that I've alredy populated
dtypes= {dict of datatypes}
store = pd.HDFStore('store.h5')
store.put('df', pd.read_csv(filenames[0],dtype=dtypes,parse_dates=
["date"])) #store one data frame
for f in filenames:
try:
temp_csv = pd.DataFrame()
temp_csv = pd.read_csv(f,dtype=dtypes,parse_dates=["trade_date"])
store.append('df', temp_csv)
except:
pass
I've tried using a subset of the filenames list, but always get the last entry. For some reason, the loop is not appending my file, but rather overwriting it every single time. Any advice would be appreciated as this is driving me bonkers. (python 3, windows)