I have some code that iterates over DB entities and runs in a task; see below.
On App Engine I'm getting an "Exceeded soft private memory limit" error, and checking memory_usage().current() confirms the problem (see the output from my logging statement below). It seems that every time a batch of foos is fetched, the memory goes up.
My question is: why is the memory not being garbage collected? I would expect that in each iteration of the loops (the while loop and the for loop, respectively), the re-use of the names foos and foo would cause the objects that foos and foo used to point to to be de-referenced (i.e. become inaccessible), and therefore become eligible for garbage collection and then be collected as memory gets tight. But evidently that is not happening.
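My mental model of that rebinding is something like the following standalone sketch (Blob and tracker are just illustrative names here, nothing to do with the App Engine code): in plain CPython, rebinding a name drops the last strong reference and the old object is freed immediately by reference counting, which a weakref lets us observe.

```python
import weakref


class Blob(object):
    """Stand-in for a fetched batch of entities."""
    pass


blob = Blob()
tracker = weakref.ref(blob)  # lets us observe when the object is freed

# Rebinding the name drops the last strong reference; CPython's
# reference counting frees the old object right away, no explicit
# gc pass needed.
blob = Blob()
print(tracker() is None)  # True: the first Blob has been collected
```

So my expectation is that each new batch should replace, not accumulate on top of, the previous one.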
from google.appengine.api.runtime import memory_usage
import logging

import models       # app-specific module defining the Foo model
import some_module  # app-specific module, see below

batch_size = 10
dict_of_results = {}
results = 0
cursor = None

while True:
    foos = models.Foo.all().filter('status =', 6)
    if cursor:
        foos.with_cursor(cursor)
    for foo in foos.run(batch_size=batch_size):
        logging.debug('on result #{} used memory of {}'.format(
            results, memory_usage().current()))
        results += 1
        bar = some_module.get_bar(foo)
        if bar:
            try:
                dict_of_results[bar.baz] += 1
            except KeyError:
                dict_of_results[bar.baz] = 1
        if results >= batch_size:
            cursor = foos.cursor()
            break
    else:
        break
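One variant I'm considering to rule out Python-level garbage is forcing a collection between batches and checking what it reports. A minimal sketch of that idea, with process_batches and handle-style placeholders standing in for the real datastore fetch and per-entity work (these names are illustrative, not part of my actual code):

```python
import gc


def process_batches(batches):
    """Tally items batch by batch, forcing a GC pass between batches.

    'batches' stands in for the successive cursor-driven fetches in the
    real loop; the tally mirrors dict_of_results.
    """
    counts = {}
    for batch in batches:
        for item in batch:
            counts[item] = counts.get(item, 0) + 1
        # gc.collect() returns the number of unreachable objects found.
        # If memory keeps climbing while this stays at 0, the growth is
        # presumably held by live references (e.g. a cache), not by
        # uncollected garbage.
        gc.collect()
    return counts


print(process_batches([[1, 2, 2], [2, 3]]))  # {1: 1, 2: 3, 3: 1}
```

Even with that in place I'd still like to understand why the collection should be necessary at all, given the rebinding argument above.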
and in some_module.py:

def get_bar(foo):
    for bar in foo.bars:
        if bar.status == 10:
            return bar
    return None
Output of logging.debug (shortened):
on result #1 used memory of 43
on result #2 used memory of 43
.....
on result #20 used memory of 43
on result #21 used memory of 49
.....
on result #32 used memory of 49
on result #33 used memory of 54
.....
on result #44 used memory of 54
on result #45 used memory of 59
.....
on result #55 used memory of 59
.....
.....
.....
on result #597 used memory of 284.3
Exceeded soft private memory limit of 256 MB with 313 MB after servicing 1 requests total