I am trying to apply the same treatment to a bunch of pandas dataframes.
As these dataframes are big, I don't have enough memory to load them all at the same time, so I keep a list of their locations and load and analyze them one by one.
However, with each iteration more and more memory is used. I suspect the dataframes are not deleted at the end of each iteration, but I don't know how to fix that.
Here is my code:
import glob

import matplotlib.pyplot as plt
import pandas as pd

folder = 'my/folder'
colors = ['b', 'r']

for i, f in enumerate(glob.glob(folder + '/*.txt')):
    print(f)
    df = pd.read_table(f, index_col=False, header=None, delimiter="\t",
                       names=['chr', 'x', 'y'])
    plt.figure(figsize=(32, 8))
    for j, chrm in enumerate(df.chr.unique()):
        plt.plot(df.loc[df.chr == chrm].x, df.loc[df.chr == chrm].y,
                 label=chrm, color=colors[j])
    plt.ylim(0, 200)
    plt.legend()
I must add that I work in Spyder.
So far, I have tried:
- adding del df and df = None at the end of the loop
- turning the body of the for-loop into a function and calling map on it
- calling gc.collect() (from the gc module) at the end of the loop
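To show what I mean, here is roughly how I combined the del and gc.collect() attempts (with a small synthetic dataframe standing in for the real files, which I cannot share; the file names and the process function are just placeholders):

```python
import gc

import pandas as pd

files = ['a.txt', 'b.txt']  # placeholder for the real list of file locations


def process(path):
    # Placeholder for pd.read_table(path, ...) on the real data
    df = pd.DataFrame({'chr': ['1', '2'], 'x': [0, 1], 'y': [10, 20]})
    result = df.chr.nunique()  # placeholder for the plotting step
    del df        # attempt 1: drop the reference explicitly
    gc.collect()  # attempt 3: force a garbage-collection pass
    return result


counts = [process(f) for f in files]
```

Even with this pattern, the memory usage still grows across iterations.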
Does somebody know how to delete my df at the end of each iteration, or an alternative solution?
Thanks a lot.