
We have an App Engine application that writes many relatively large files to Google Cloud Storage. These files are CSVs that are created dynamically, so we use Python's StringIO.StringIO as a buffer and csv.writer as the interface for writing to that buffer.

In general, our process looks like this:

import csv
import StringIO

import cloudstorage as gcs  # the Google Cloud Storage client library

file_buffer = StringIO.StringIO()
writer = csv.writer(file_buffer)

# ...
# write some rows
# ...

data = file_buffer.getvalue()
filename = '/bucket_name/someFilename.csv'  # GCS paths include the bucket; bucket_name is a placeholder

try:
    with gcs.open(filename, content_type='text/csv', mode='w') as file_stream:
        file_stream.write(data)
        # the with block closes file_stream for us

except Exception as e:
    # handle exception
    pass
finally:
    file_buffer.close()

As we understand it, the csv.writer itself does not need to be closed; only the file_buffer above and the file_stream need to be closed.


We run the above process in a deferred task, invoked via App Engine's task queue. Ultimately, we get the following error after a few invocations of our task:

Exceeded soft private memory limit of 128 MB with 142 MB after servicing 11 requests total
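
For reference, the work is enqueued roughly like this (a minimal sketch; write_csvs is a placeholder name for the function containing the logic above):

from google.appengine.ext import deferred

# write_csvs is a placeholder for the function that builds the CSV in the
# StringIO buffer and writes it to Cloud Storage, as shown above
deferred.defer(write_csvs)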

Clearly, then, there is a memory leak in our application. However, if the above code is correct (which we admit may not be the case), then our only other idea is that some large amount of memory is being held through the servicing of our requests (as the error message suggests).

Thus, we are wondering whether App Engine keeps some entities alive during the execution of a deferred task. We should also note that our CSVs are ultimately written successfully, despite these error messages.

nmagerko
    `"Clearly, then, there is a memory leak"` - not necessarily. The app's baseline memory usage (with all the code loaded) may be higher that the one seen at startup. 128M is not that much, the default for backend instances these days is B2/256M. Peaks of parallel processed requests matter, too. Also see http://stackoverflow.com/questions/32981118/how-does-app-engine-python-manage-memory-across-requests-exceeded-soft-privat, http://stackoverflow.com/questions/31853703/google-app-engine-db-query-memory-usage, http://stackoverflow.com/questions/33036334/memory-leak-in-google-ndb-library – Dan Cornilescu Feb 03 '16 at 23:22
  • I think that could constitute an answer! Very interesting how Google's B2 instance has half the memory... – nmagerko Feb 03 '16 at 23:24
  • How are you using the deferred task(s)? One at a time, or do you pile up a lot of them? Are their executions overlapping? – Dan Cornilescu Feb 03 '16 at 23:27
  • They are running one after another. I just halved the amount of data we are processing with each call, and the memory errors disappeared (so your implied solution is correct) – nmagerko Feb 03 '16 at 23:32
  • Sounds like now there's enough time for the garbage collector to keep up with the work. Had the same problem trying to keep up with processing peaks of uploads, I only had to pace the activity a bit. – Dan Cornilescu Feb 04 '16 at 03:35

1 Answer


The symptom described isn't necessarily an indication of an application memory leak. Potential alternate explanations include:

  • the app's baseline memory footprint (which, for scripting-language sandboxes like Python, can be bigger than the footprint at instance startup time, see Memory usage differs greatly (and strangely) between frontend and backend) may be too high for the instance class configured for the app/module. To fix, choose a higher-memory instance class (which, as a side effect, also means a faster instance). Alternatively, if the rate at which instances are killed for exceeding the memory limit is tolerable, just let GAE recycle the instances :)
  • peaks of activity, especially if multi-threaded request handling is enabled, mean higher memory consumption and can also overload the memory garbage collector. Limiting the number of requests performed in parallel, adding (higher) delays in lower-priority deferred task processing, and other similar measures that reduce the average request processing rate per instance can give the garbage collector a chance to clean up leftovers from requests (see the sketch after this list). Scalability should not be harmed (with dynamic scaling), as other instances would be started to help with the activity peak.
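
A minimal sketch of the pacing idea, assuming the work is enqueued with the deferred library (process_chunk, chunks and the 30-second stagger are placeholders, not the asker's actual code; the instance class itself is raised via the instance_class setting in the module's .yaml):

from google.appengine.ext import deferred

# Instead of enqueueing every chunk at once, spread the deferred tasks out
# with _countdown so fewer of them are serviced per instance at the same
# time, giving the garbage collector room to reclaim memory between requests.
for i, chunk in enumerate(chunks):
    deferred.defer(process_chunk, chunk, _countdown=i * 30)  # stagger by 30s

The queue's rate and max_concurrent_requests settings in queue.yaml can achieve a similar throttling effect without touching the enqueueing code.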

Related Q&As:

Dan Cornilescu