I'm using GAE (Google App Engine) to run a very simple web application. In the request handler, I just create a big array of objects. After that, I delete all references to the array and call gc.collect().

But when I test it (sending requests) for a long time, the memory usage shown in the Dashboard keeps increasing.

It looks like a memory leak, but I think the code is OK.

Below is a code sample:

from flask import Flask, request
import gc

app = Flask(__name__)

@app.route('/', methods=['POST'])
def hello():
    gc.enable()

    bigArr = []
    for x in range(10000):
        raw_data = request.get_data(cache=False)
        bigArr.append(raw_data)
        del raw_data

    print('len(bigArr):' + str(len(bigArr)))
    del bigArr
    gc.collect()

    return 'Hello World'


if __name__ == '__main__':
    app.run(host='0.0.0.0', port=80, debug=True)

App Engine config:

    runtime: python37
    automatic_scaling:
      max_instances: 1

[Image: memory usage graph from the GAE dashboard]

new name
  • Potentially related: https://stackoverflow.com/a/45061554/4495081 – Dan Cornilescu Jun 04 '19 at 10:53
  • I would like to add more information; the config is as follows: runtime: python37, automatic_scaling: max_instances: 1 –  Jun 04 '19 at 14:13
  • 2
    Many people have this issue, but I've never seen a satisfactory explanation of why Python doesn't free memory more quickly. You would think the variables going out of scope would be sufficient to free the memory. I hope a Python expert chimes in and explains this. – new name Jun 05 '19 at 18:15
  • A few links that might be helpful: [NDB caching](https://cloud.google.com/appengine/docs/standard/python/ndb/cache), [mleak](https://grokbase.com/t/gg/google-appengine/13ankdy5np/instance-memory), [Apptrace](https://code.google.com/archive/p/apptrace/wikis/UsingApptrace.wiki) – ASHu2 Jun 09 '19 at 04:51

1 Answer


That graph doesn't look as if your memory usage "continues to increase". Rather, it looks pretty flat. If you had a significant memory leak, the graph would go up instead.

The Python process needs to get memory from the operating system, and then uses that memory to store your Python objects. When Python objects are garbage collected, the memory occupied by those objects becomes free to the Python process, because new objects can be stored there. But to the operating system, that memory is still owned by the Python process, so it is in use. I suppose your graph shows the memory usage of that Python process.
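To see this distinction in action (a minimal sketch, unrelated to GAE): gc.collect() reports how many unreachable objects it found and freed, which shows collection working at the Python level even though the memory stays owned by the process.

```python
import gc


class Node:
    """Simple object used to build a reference cycle."""
    def __init__(self):
        self.ref = None


# Build a cycle that reference counting alone cannot reclaim.
a = Node()
b = Node()
a.ref = b
b.ref = a

# Drop our references; the cycle keeps both objects alive internally.
del a, b

# The cyclic garbage collector finds and frees the unreachable cycle.
# The freed memory is now available to *this process* for new objects,
# but the OS still counts it as memory owned by the process.
freed = gc.collect()
print('unreachable objects collected:', freed)
```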

Memory is requested from the operating system in much larger chunks than a single Python object needs, and it would have to be returned in large chunks as well. As Python objects are allocated and later garbage-collected, the remaining live objects end up scattered across those big memory chunks. If the Python process wanted to release memory back to the operating system, it would have to move all the live objects into a compact area so that a large, contiguous region becomes free. It's easier, and faster, for the process to just hold on to the memory and re-use it as needed.

Roland Weber