I am trying to apply the same treatment to a bunch of pandas dataframes.

As these dataframes are big, I don't have enough memory to load them all at the same time. So I have a list of their respective locations and I want to load and analyze them one by one.

However, with each iteration, more and more memory is used. I guess the dataframes are not deleted at the end of each iteration. I don't know how to fix it.

Here is my code:

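import glob
import os

import matplotlib.pyplot as plt
import pandas as pd

folder = 'my/folder'
colors = ['b', 'r']

# Process the files one at a time so only one dataframe is loaded at once
for i, f in enumerate(glob.glob(os.path.join(folder, '*.txt'))):
    print(f)
    df = pd.read_table(f, index_col=False, header=None, delimiter="\t", names=['chr', 'x', 'y'])
    plt.figure(figsize=(32, 8))  # a new figure is created on every iteration
    # Draw one line per chromosome
    for j, chrm in enumerate(df.chr.unique()):
        plt.plot(df.loc[df.chr == chrm].x, df.loc[df.chr == chrm].y, label=chrm, color=colors[j])
    plt.ylim(0, 200)
    plt.legend()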

I must add that I work in Spyder.

So far, I have tried:

  • adding del df and df = None at the end of the loop
  • turning the for-loop into a function and calling map on it
  • calling gc.collect() from the gc module at the end of the loop

Does anybody know how to delete my df at the end of each iteration, or an alternative solution?

Thanks a lot.

Anoikis
  • I don't believe the memory issue is related to the dataframes, but rather your charts. Are you closing your figures? Try testing with no charts but the same dataframe loop and see if you still have the issue. – Alexander Sep 17 '18 at 17:30
  • You were right, that was the problem. I was trying to delete the wrong object, my bad. Thanks a lot! – Anoikis Sep 18 '18 at 07:31
  • Does this answer your question? [How to delete multiple pandas (python) dataframes from memory to save RAM?](https://stackoverflow.com/questions/32247643/how-to-delete-multiple-pandas-python-dataframes-from-memory-to-save-ram) – Georgy Nov 06 '19 at 14:30
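
Following the comments above, here is a minimal sketch of the loop with each figure closed explicitly once it has been saved. It assumes the plots are written to disk rather than shown interactively; the helper names `plot_file` and `plot_all`, the output directory, the `.png` naming and the `Agg` backend are all illustrative choices, not part of the original question.

```python
import glob
import os

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, figures only go to disk
import matplotlib.pyplot as plt
import pandas as pd


def plot_file(path, out_dir):
    """Plot one tab-separated file, save the image, then release the figure."""
    df = pd.read_csv(path, sep="\t", header=None, names=["chr", "x", "y"])
    fig, ax = plt.subplots(figsize=(32, 8))
    try:
        # One line per chromosome; groupby avoids filtering the frame twice
        for chrm, group in df.groupby("chr"):
            ax.plot(group["x"], group["y"], label=str(chrm))
        ax.set_ylim(0, 200)
        ax.legend()
        out_path = os.path.join(out_dir, os.path.basename(path) + ".png")
        fig.savefig(out_path)
    finally:
        # pyplot keeps a reference to every open figure, so close it explicitly
        plt.close(fig)
    return out_path


def plot_all(folder, out_dir):
    """Process every .txt file in `folder`, one file (and one figure) at a time."""
    os.makedirs(out_dir, exist_ok=True)
    paths = sorted(glob.glob(os.path.join(folder, "*.txt")))
    return [plot_file(p, out_dir) for p in paths]
```

Because `df` and `fig` are local to `plot_file`, both go out of scope when the function returns, and `plt.close(fig)` removes the figure from pyplot's own registry, so memory should stay roughly flat across files.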

1 Answer

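The del statement only deletes the name, not the object itself. The memory is released once nothing else references the object; to force collection of anything still lingering (for example, objects caught in reference cycles), you can run the garbage collector manually. Try this: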

import gc
gc.collect()
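
To illustrate what del and gc.collect() each do, here is a small sketch that assumes CPython's reference counting. `Payload` is a hypothetical stand-in for a large object such as a dataframe. Note that neither call helps while something else, such as pyplot's list of open figures, still holds a reference.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for a large object such as a dataframe."""


# Case 1: a single reference. Dropping the name frees the object at once.
obj = Payload()
probe = weakref.ref(obj)
del obj
freed_by_del = probe() is None

# Case 2: a reference cycle. Reference counting alone cannot free it.
gc.disable()  # keep the automatic collector out of the way for the demo
a, b = Payload(), Payload()
a.other, b.other = b, a
cycle_probe = weakref.ref(a)
del a, b
alive_before_collect = cycle_probe() is not None
gc.collect()  # the cycle collector finds and frees the unreachable pair
alive_after_collect = cycle_probe() is not None
gc.enable()

print(freed_by_del, alive_before_collect, alive_after_collect)
```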
canovasjm