I have a dataframe which I am currently splitting into groups and then merging back together in a loop:
dfs = [group for _, group in df.groupby(by=group)]
df = dfs.pop(0)
while dfs:
    df = df.merge(dfs.pop(0), how='outer', on=col_data)
This works fine for a small number of dataframes in the list, but for larger lists it gets quite slow and eventually crashes with a memory error. I tried deleting each item from the list as I go through it, but that did not help.
For example, I have a list of 643 dfs.
Maybe pandas is not the best tool for this and I should be using NumPy instead, but I am not sure how to do that.
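For reference, here is a minimal, self-contained version of what I am doing, with made-up data and column names (`key` standing in for my grouping column and `col_data` for the merge key — my real data has many more groups and columns):

```python
import pandas as pd

# Toy data: 'key' is the grouping column, 'col_data' is the merge key.
df = pd.DataFrame({
    'key': ['a', 'a', 'b', 'b', 'c'],
    'col_data': [1, 2, 1, 2, 1],
    'value': [10, 20, 30, 40, 50],
})

# Split into one dataframe per group, then outer-merge them back together,
# popping each group off the list as it is consumed to free memory.
dfs = [group for _, group in df.groupby(by='key')]
merged = dfs.pop(0)
while dfs:
    merged = merged.merge(dfs.pop(0), how='outer', on='col_data')

print(merged.shape)
```

With only three small groups this is instant; the slowdown appears as the number of groups grows into the hundreds, since each merge widens the result (pandas suffixes the overlapping columns with `_x`/`_y`).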