
I have a few workers listening to a RabbitMQ queue and doing some disk-I/O-intensive work: they open ~18MB files, do some parsing, and write to some files. While processing one job, a worker can take up to 200MB of memory, and this is fine.

However, my problem is that the worker then sits idle while still holding on to this amount of memory. I have blindly tried to trigger garbage collection manually with gc.collect() after the job is done, but without any results.

My worker class that receives the job looks like this:

class BuildWorker(worker.Worker):

    def callback(self, ch, method, properties, body):
        fp = FileParseAndStuff()
        fp.execute_job(ch, method, properties, body)
        fp = None

Shouldn't everything that happens inside fp here be contained memory-wise and be released once I set that object to None? I have tried Python's del statement as well, but without any improvements.
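
For completeness, this is roughly what one of those attempts looked like (the same callback as above, with del and a manual collection pass added; nothing else changed):

import gc

class BuildWorker(worker.Worker):

    def callback(self, ch, method, properties, body):
        fp = FileParseAndStuff()
        fp.execute_job(ch, method, properties, body)
        # drop my only reference to the parser and force a GC pass
        del fp
        gc.collect()

As mentioned, neither version makes any difference to the memory usage.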

I'm using Python 2.7 and python-pika to communicate with the RabbitMQ server, if that matters.

Niklas9
  • You can try to find out which objects take this space, using [objgraph](http://mg.pov.lt/objgraph/). – fjarri Sep 29 '13 at 10:15
  • See this http://stackoverflow.com/questions/11957539/python-memory-not-being-given-back-to-kernel – Maciej Gol Sep 29 '13 at 10:24
  • I'm having the same problem (although with Python 3.6). I think my worker is closing down everything, the channel, the DB connection, etc. But over time, the system runs out of memory. – John C Jan 21 '20 at 02:30
  • 1
    The Problem could be in FileParseAndStuff but without code nobody can say – Lee Jan 22 '20 at 09:02

1 Answer


Put a flag before fp = None, e.g. write "done" to a file or print to the console. Your worker may not have reached that point yet when you call del or gc.collect(). If it has, check your execute_job method.
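
Just a sketch of what I mean, reusing the class from the question (the print is only there as a flag so you can see in the console that the line is actually reached):

class BuildWorker(worker.Worker):

    def callback(self, ch, method, properties, body):
        fp = FileParseAndStuff()
        fp.execute_job(ch, method, properties, body)
        print "job done, dropping fp"  # flag: did we actually get here?
        fp = None

If you never see the flag, execute_job is still blocking or raising; if you do see it, the leak is more likely inside FileParseAndStuff itself.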

cox
  • I don't get this. I'm not using any asynchronous framework for Python, so fp.execute_job() completes before the next line (fp = None) is executed, so I don't see how this would help. – Niklas9 Sep 29 '13 at 11:32
  • Try to pass a dummy job to the worker; maybe the reference is kept by the worker itself and only released when a new job starts. – cox Sep 29 '13 at 11:44
  • No, it isn't, unfortunately; the memory consumption grows by the same amount for each job :( – Niklas9 Sep 29 '13 at 13:46