
I'm using MySQLdb under Python 3.2 on Windows 7:

Python 3.2.3 (default, Apr 11 2012, 07:12:16) [MSC v.1500 64 bit (AMD64)] on win32
>>> import MySQLdb as My
>>> My.version_info
(1, 2, 3, 'final', 0)

I'm running a service that calls this code over and over again:

cursor = self._connection._conn.cursor()
cursor.execute(sql)
for i in cursor.fetchall(): pass # Operation that is not important
cursor.close()
gc.collect()
return set() # Will be filled with DB data

Memory usage just keeps going up. I've already tried diagnosing it and ended up with this:

83    23.129 MB     0.000 MB           cursor = self._connection._conn.cursor()
84    23.129 MB     0.000 MB           cursor.execute(sql)
85    23.137 MB     0.008 MB           for i in cursor.fetchall(): pass
86    23.137 MB     0.000 MB           cursor.close()
87
88    23.137 MB     0.000 MB           gc.collect()
89
90    23.137 MB     0.000 MB           return set()

The `__iter__` API doesn't seem to be any better:

84    23.145 MB     0.000 MB           cursor.execute(sql)
85    23.145 MB     0.000 MB           for i in cursor: pass
86    23.152 MB     0.008 MB           cursor.close()
87
88    23.152 MB     0.000 MB           gc.collect()
89
90    23.152 MB     0.000 MB           return set()

Nor does looping manually with `fetchone()`:

84    23.141 MB     0.000 MB           cursor.execute(sql)
85    23.141 MB     0.000 MB           while True:
86    23.141 MB     0.000 MB               row = cursor.fetchone()
87    23.141 MB     0.000 MB               if not row:
88    23.141 MB     0.000 MB                   break
89    23.148 MB     0.008 MB           cursor.close()
90
91    23.148 MB     0.000 MB           gc.collect()
92
93    23.148 MB     0.000 MB           return set()

So why doesn't memory drop back to 23.129 MB (why does each call use another 8 KB)? Is the cursor buggy? Am I doing something wrong?

Vyktor
  • The [GC](http://docs.python.org/3.2/library/gc.html) module has a debug interface for looking into memory leaks. Does that give you any more insight? – dwxw Oct 09 '13 at 09:25
  • @dwxw It gives me so much info that I'm unable to interpret it. If you can point me to a good resource for diagnosing this with `gc`, I'll be glad to read it. – Vyktor Oct 09 '13 at 09:36
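
One way to make the `gc` output mentioned in the comments above easier to read is to count tracked objects by type before and after many calls, and look only at the types whose count grew. The following is a minimal sketch, not the asker's code: it uses the standard-library `sqlite3` module as a stand-in for MySQLdb so that it runs anywhere, and the helper names `count_objects_by_type`, `run_query`, and `object_growth` are made up for illustration.

```python
import gc
import sqlite3
from collections import Counter


def count_objects_by_type():
    """Return a Counter mapping type name -> number of GC-tracked objects."""
    gc.collect()  # drop unreachable cycles first so they are not counted
    return Counter(type(obj).__name__ for obj in gc.get_objects())


def run_query(conn, sql):
    """Same shape as the code in the question (sqlite3 stands in for MySQLdb)."""
    cursor = conn.cursor()
    cursor.execute(sql)
    for _row in cursor.fetchall():
        pass
    cursor.close()
    return set()


def object_growth(func, repeats=100):
    """Call func repeatedly and report which object types grew in number."""
    func()  # warm-up call, so one-time caches are not reported as growth
    before = count_objects_by_type()
    for _ in range(repeats):
        func()
    after = count_objects_by_type()
    return {name: after[name] - before[name]
            for name in after if after[name] > before[name]}


# Assumed demo setup: an in-memory database with a small table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10)])

growth = object_growth(lambda: run_query(conn, "SELECT x FROM t"))
print(growth)  # a type whose count grows by about `repeats` is a leak suspect
```

A few small entries may show up because the measurement itself creates objects. A real leak shows up as a count that scales with `repeats`. Note that this only sees Python objects tracked by the collector; memory held inside a C extension would not appear here.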

1 Answer


IIRC `cursor.fetchall()` builds an in-memory list of rows, and since memory allocation is costly, Python tends to retain memory it has already allocated. Try iterating over your cursor instead, i.e. `for row in cursor: do_something_with(row)`.
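
A minimal sketch of the iterate-instead-of-`fetchall()` suggestion above. It assumes `sqlite3` from the standard library as a stand-in for MySQLdb; the function name `fetch_ids_streaming` and the table `t` are hypothetical.

```python
import sqlite3
from contextlib import closing


def fetch_ids_streaming(conn, sql):
    """Collect the first column of each row without building a list of rows."""
    result = set()
    # closing() calls cursor.close() even if the loop raises.
    with closing(conn.cursor()) as cursor:
        cursor.execute(sql)
        for row in cursor:  # rows are consumed one at a time
            result.add(row[0])
    return result


# Assumed demo setup: an in-memory database with three rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(0,), (1,), (2,)])
ids = fetch_ids_streaming(conn, "SELECT x FROM t")
```

As far as I know, MySQLdb's default cursor class buffers the whole result set on the client when `execute()` runs, so iterating over it may not lower peak memory there; `MySQLdb.cursors.SSCursor` is the variant that leaves rows on the server until they are fetched.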

bruno desthuilliers
  • I was already writing up an edit with the different methods. Always the same result... `+8KB` – Vyktor Oct 09 '13 at 09:19
  • Plus `gc.collect()` should take care of that, shouldn't it? – Vyktor Oct 09 '13 at 09:36
  • I notice that in the two methods not using `cursor.fetchall()`, the memory usage increases on the call to `cursor.close()`... You should perhaps 1/ put the `pass` statement on its own line and 2/ add some no-op statement just before the call to `cursor.close()`. You may also read http://stackoverflow.com/questions/9617001/python-garbage-collection-fails and http://stackoverflow.com/questions/1316767/how-can-i-explicitly-free-memory-in-python/ for more on the topic. Finally, note that virtual memory management is a complex topic... But I assume you already know this – bruno desthuilliers Oct 09 '13 at 10:42
  • Oh, and yes: if the memory is used by the `cursor` object itself, `gc.collect()` _won't_ collect it, as you still have a reference to it in the current namespace. Adding a `del cursor` between `cursor.close()` and `gc.collect()` might be a good idea here ;) – bruno desthuilliers Oct 09 '13 at 10:44
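
The point about `del cursor` in the last comment can be checked with a weak reference. This is only an illustration under CPython's reference counting: `FakeCursor` is a hypothetical stand-in class, not MySQLdb's cursor, and it deliberately keeps its rows after `close()`.

```python
import gc
import weakref


class FakeCursor:
    """Hypothetical stand-in for a DB-API cursor that buffers its rows."""

    def __init__(self):
        self.rows = [(i,) for i in range(1000)]
        self.closed = False

    def close(self):
        # For illustration only: closing does not release the buffered rows.
        self.closed = True


def demo():
    cursor = FakeCursor()
    ref = weakref.ref(cursor)  # lets us observe whether the object still exists

    cursor.close()
    gc.collect()
    alive_while_bound = ref() is not None  # the name `cursor` still refers to it

    del cursor  # drop the last strong reference
    gc.collect()
    alive_after_del = ref() is not None

    return alive_while_bound, alive_after_del


result = demo()
print(result)  # (True, False) on CPython
```

In the question's function the local name `cursor` disappears when the function returns anyway, so `del cursor` mainly changes what the profiler sees at the `gc.collect()` line; it would not by itself explain growth that persists across calls.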