EDIT: If I explicitly change the matplotlib backend from 'Qt4Agg' to just 'Agg', then I am able to run my code with no errors. I assume this is a bug in the backend?
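
For reference, a minimal sketch of the backend switch described in the edit above. The plotted data and the in-memory buffer are placeholders, not the code from this question; the key assumption is that the figures only need to be saved and never shown on screen.

```python
import io

import matplotlib

# Select the non-interactive Agg backend. Doing this before importing
# pyplot means no GUI toolkit (such as Qt) has to be loaded.
matplotlib.use("Agg")

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [0, 1, 4])   # placeholder data

buf = io.BytesIO()              # stands in for a real file name such as 'temp.png'
fig.savefig(buf, format="png")
plt.close(fig)

backend_name = matplotlib.get_backend()
```

With Agg selected, `plt.show()` cannot open a window, so this only suits the save-only case.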

I am writing some code for processing a fairly large amount of data automatically. The code first of all parses my data files and stores all of the relevant bits. I then have different functions for producing each of the graphs I need (there are about 25 in all). However, I keep running into some kind of memory error and I think it is because Matplotlib / PyPlot are not releasing the memory correctly.

Each plotting function ends with a pyplot.close(fig) command and since I just want to save the graphs and not look at them immediately they do not include a pyplot.show().

If I run the plotting functions individually in an interpreter then I don't get any problems. However, if I make a separate function which calls each plotting function in turn then I run into a "MemoryError: Could not allocate memory for path".

Has anyone come across a problem like this? It seems to be related to "Matplotlib runs out of memory when plotting in a loop", but pyplot.close() doesn't fix my problem.

This is what a typical plot function looks like in my code:

def TypicalPlot(self, title=None, comment=False, save=False, show=True):

    if title is None:
        title = self.dat.title

    fig = plt.figure()
    host = SubplotHost(fig, 111)
    fig.add_subplot(host)
    par = host.twinx()
    host.set_xlabel("Time (hrs)")
    host.set_ylabel("Power (W)")
    par.set_ylabel("Temperature (C)")
    p1, = host.plot(self.dat.timebase1, self.dat.pwr, 'b,', label="Power",
                    markevery= self.skip)
    p2, = par.plot(self.dat.timebase2, self.dat.Temp1, 'r,', 
                   label="Temp 1", markevery= self.skip)
    p3, = par.plot(self.dat.timebase2, self.dat.Temp2, 'g,', 
                   label="Temp 2", markevery= self.skip)
    p4, = par.plot(self.dat.timebase2, self.dat.Temp3, 'm,', 
                   label="Temp 3", markevery= self.skip)
    host.axis["left"].label.set_color(p1.get_color())
    # par.axis["right"].label.set_color(p2.get_color())
    #host.legend(loc='lower left')
    plt.title(title+" Temperature")

    leg=host.legend(loc='lower left',fancybox=True)
    #leg.get_frame().set_alpha(0.5)
    frame  = leg.get_frame()
    frame.set_facecolor('0.80')

    ### make the legend text smaller
    for t in leg.get_texts():
        t.set_fontsize('small')

    ### set the legend text color to the same color as the plots for added
    ### readability
    leg.get_texts()[0].set_color(p1.get_color())
    leg.get_texts()[1].set_color(p2.get_color())
    leg.get_texts()[2].set_color(p3.get_color())    
    leg.get_texts()[3].set_color(p4.get_color())        

    if show is True and save is True:
        plt.show()
        plt.savefig('temp.png')
    elif show is True and save is False:
        plt.show()
    elif show is False and save is True:
        plt.savefig('temp.png')
        plt.clf()
        plt.close(fig)

If I now run in a terminal

MyClass.TypicalPlot(save=True, show = False) 

Then I don't get any errors. The same is true for all of my plot functions.

If I make a new function which does this:

def saveAllPlots(self, comments = False):

        if self.comment is None: comment = False
        else: comment = True
        self.TypicalPlot(save=True, show=False, comment=comment)
        self.AnotherPlot(save=True, show=False)
        self.AnotherPlot2(save=True, show=False)
        self.AnotherPlot3(save=True, show=False)
        ...etc, etc, etc

Then it runs through about half of the graphs and then I get "MemoryError: Could not allocate memory for path".

FakeDIY
  • Did you try clearing the figures before closing them? – ev-br May 17 '12 at 10:05
  • Try adding `del fig` to the end of the function, after and independent of the if/elif. – Roland Smith May 20 '12 at 10:19
  • `plt.clf()` doesn't help. I've also tried to clear the axes with `cla()`, but that doesn't help either. I will try `del fig` when I get a chance. – FakeDIY May 21 '12 at 15:29
  • Changing the matplotlib backend solves the problem. I have updated my original question to reflect this, but I would appreciate any further information anyone has. – FakeDIY May 22 '12 at 12:39
  • I don't understand why it's happening, but in terms of fixing it, how about doing an explicit garbage collection with `import gc` at the beginning and a `gc.collect()` after each loop iteration? – Ferdinand van Wyk Jun 21 '15 at 21:30
  • I had a similar problem saving several plots. In my case Matplotlib plotted data from the last plot in the following one. I used `plt = None` to solve it. I guess in your case `fig = None` at the end of your code could help to create an entirely new figure when TypicalPlot is called. – BigZ Oct 02 '15 at 15:03
  • The last two comments combined is what I would try: `fig = None` followed by an explicit `gc.collect()`. – RobertB Oct 09 '15 at 18:30
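
The suggestions in the comments above could be combined roughly as follows. This is only a sketch: `save_plot` is a hypothetical stand-in for one of the plotting methods, the data is made up, and the Agg backend is assumed because nothing is shown on screen. Since `fig` is a local variable, returning from the function drops the reference in the same way `fig = None` or `del fig` would.

```python
import gc
import io

import matplotlib
matplotlib.use("Agg")  # assumption: plots are only saved, never shown
import matplotlib.pyplot as plt


def save_plot(index):
    """Hypothetical stand-in for one of the plotting methods."""
    fig, ax = plt.subplots()
    ax.plot(range(100), [index * x for x in range(100)])
    buffer = io.BytesIO()               # stands in for a real file name
    fig.savefig(buffer, format="png")
    plt.close(fig)                      # close in every case, not in one branch only
    return len(buffer.getvalue())       # size of the PNG in bytes


sizes = []
for i in range(25):
    sizes.append(save_plot(i))
    gc.collect()                        # explicit collection after each plot

open_figures = plt.get_fignums()        # figures pyplot is still tracking
```

If `open_figures` is not empty after the loop, some plotting function is returning without closing its figure.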

2 Answers

I think this happens because the program runs out of memory as it works through the graphs, probably because the memory used by earlier figures isn't being released properly.

Why don't you try creating about three programs, each of which does a few of the graphs, instead of one program doing all of them:

Program 1: Graphs 1-8

Program 2: Graphs 9-16

Program 3: Graphs 17-25
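
One way to get the same effect without maintaining three scripts is to launch a fresh Python process per batch, so each batch's memory is returned to the operating system when that process exits. A rough sketch, in which the child code, the `run_batch` helper, the file names and the plotted data are all made up for illustration:

```python
import os
import subprocess
import sys
import tempfile

# Code run by each child interpreter. The plot itself is a placeholder.
CHILD_CODE = """
import os, sys
import matplotlib
matplotlib.use("Agg")
import matplotlib.pyplot as plt

out_dir = sys.argv[1]
for n in sys.argv[2:]:
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, int(n), 2 * int(n)])
    ax.set_title("Graph " + n)
    fig.savefig(os.path.join(out_dir, "graph_" + n + ".png"))
    plt.close(fig)
"""


def run_batch(graph_numbers, out_dir):
    """Produce the given graphs in a fresh Python process (hypothetical helper)."""
    args = [sys.executable, "-c", CHILD_CODE, out_dir]
    args += [str(n) for n in graph_numbers]
    subprocess.run(args, check=True)    # raises if the child process fails


out_dir = tempfile.mkdtemp()
batches = [range(1, 9), range(9, 17), range(17, 26)]  # graphs 1-8, 9-16, 17-25
for batch in batches:
    run_batch(batch, out_dir)

saved = sorted(os.listdir(out_dir))
```

In real code each child would also have to parse the data files itself, so this trades memory for repeated start-up work.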

Hope this helps @FakeDIY : )

Rlz

I ran into a very similar problem once. I assume matplotlib keeps a reference to each plot internally. Consider the following code, which creates three separate figures:

import sys

import matplotlib.pyplot as plt
import numpy as np

# block 1
f, ax = plt.subplots(1)
plt.plot(np.arange(10), np.random.random(10))
plt.title("first")
print('first', sys.getrefcount(f), sys.getrefcount(ax))

# block 2
f, ax = plt.subplots(1)
plt.plot(np.arange(10), np.random.random(10)+1)
plt.title("second")
print('second', sys.getrefcount(f), sys.getrefcount(ax))

# block 3
f, ax = plt.subplots(1)
plt.plot(np.arange(10), np.random.random(10)+2)
plt.title("third")
print('third', sys.getrefcount(f), sys.getrefcount(ax))

plt.show()

# reference counts of the last figure and axes after the windows are closed
print('after show', sys.getrefcount(f), sys.getrefcount(ax))

Output:

first 69 26
second 69 26
third 69 26
after show 147 39

This is counterintuitive, because we rebound f and ax several times. Each block creates a new figure, which becomes the current figure that plt operates on, so the earlier figures are no longer reachable through f or ax. But there must be some internal reference that allows plt.show() to show all three figures. Those references seem to be persistent, and thus the figures won't be collected by the gc.
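
A quick way to see that pyplot keeps track of figures on its own is to ask it which figure numbers are still open. The sketch below assumes the Agg backend so that it runs without a display; `plt.get_fignums()` and `plt.close("all")` are standard pyplot calls.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display needed for this check
import matplotlib.pyplot as plt

plt.close("all")                  # start from a clean state

for label in ("first", "second", "third"):
    f, ax = plt.subplots(1)       # rebinding f and ax does not discard the old figure
    ax.set_title(label)

open_before = plt.get_fignums()   # figure numbers pyplot still tracks
plt.close("all")                  # ask pyplot to drop every figure it tracks
open_after = plt.get_fignums()
```

Here `open_before` contains three entries even though only the last figure is still bound to `f`, and `open_after` is empty.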

The workaround I settled on was to reuse a single figure and change the data of the plot. In hindsight it was a better approach anyway:

# continues from the snippet above (plt and np are already imported)
plt.ion()
f, ax = plt.subplots(1)
line = ax.plot(np.arange(10), np.random.random(10))[0]
plt.title('first')
plt.show()

for i, s in [(2, 'second'), (3, 'third')]:
    x = np.arange(10)
    y = np.random.random(10)+i
    line.set_data(x, y)                # replace the data of the existing line
    ax.set_xlim(np.min(x), np.max(x))
    ax.set_ylim(np.min(y), np.max(y))
    plt.title(s)
    plt.draw()
    input(s)                           # wait for Enter before the next update

The only drawback is that you have to keep the window with the figure open, and without the pause for keyboard input at the end of the loop the program will just run straight through.
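
For the save-only case in the question, the same idea might work without any window at all. The following is a sketch under that assumption: it uses the Agg backend, random placeholder data, and in-memory buffers standing in for real file names.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: figures are saved, not displayed
import matplotlib.pyplot as plt
import numpy as np

plt.close("all")
fig, ax = plt.subplots(1)
line, = ax.plot([], [])               # one line object, reused for every plot

saved_sizes = []
for i, title in enumerate(["first", "second", "third"]):
    x = np.arange(10)
    y = np.random.random(10) + i
    line.set_data(x, y)               # swap in the new data
    ax.relim()                        # recompute data limits from the line
    ax.autoscale_view()               # rescale the axes to those limits
    ax.set_title(title)
    buffer = io.BytesIO()             # stands in for a real file name
    fig.savefig(buffer, format="png")
    saved_sizes.append(len(buffer.getvalue()))

figures_open = len(plt.get_fignums())  # only one figure is ever created
plt.close(fig)
```

Only one figure exists for the whole run, so there is nothing to accumulate between plots.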

rikisa