I have approximately 50,000 .pkl files, each containing two pandas DataFrames, which I want to append to two large DataFrames.
I tried looping over the files, reading each one in, and appending one by one, which gets painfully slow (why? see here):
    import os
    import pickle
    import pandas as pd

    DF_a = pd.DataFrame()  # start from empty DataFrames
    DF_b = pd.DataFrame()
    for appended_file in os.listdir(folderwithallfiles):
        with open(appenddirectory + appended_file, 'rb') as data:
            df_a, df_b = pickle.load(data)
        # each concat copies everything accumulated so far
        DF_a = pd.concat([DF_a, df_a], axis=0)
        DF_b = pd.concat([DF_b, df_b], axis=0)
As suggested in the linked post, I am trying to build a list of DataFrames and concatenate them once at the end, but the only way I can think of doing that is to rename the DataFrames inside the loop (like here), which is advised against. I also do not see how I could put them in a dictionary and concat from there. Any advice?
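
For illustration, the list-based approach from the linked post could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes every pickle holds a pair (df_a, df_b), and the function name `load_and_concat` and the list names `frames_a` / `frames_b` are hypothetical. No renaming is needed, because each loaded DataFrame is simply appended to a list.

```python
import os
import pickle

import pandas as pd


def load_and_concat(folder):
    """Read every .pkl file in `folder` and concatenate once at the end.

    Assumes each pickle holds a pair (df_a, df_b) of DataFrames.
    """
    frames_a = []  # plain lists: appending only stores a reference
    frames_b = []
    for name in sorted(os.listdir(folder)):
        if not name.endswith(".pkl"):
            continue  # skip anything that is not a pickle file
        with open(os.path.join(folder, name), "rb") as fh:
            df_a, df_b = pickle.load(fh)
        frames_a.append(df_a)
        frames_b.append(df_b)
    # a single concat per output instead of one per file
    big_a = pd.concat(frames_a, axis=0)
    big_b = pd.concat(frames_b, axis=0)
    return big_a, big_b
```

Calling `DF_a, DF_b = load_and_concat(folderwithallfiles)` would then replace the loop. Note that `pd.concat` raises a `ValueError` when given an empty list, so the folder must contain at least one pickle. A dictionary should work the same way: `pd.concat` also accepts a mapping, in which case the dictionary keys (for example the file names) become an outer index level.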