
Imagine the following scenario:

from flask import Flask
from flask_cache import Cache  # the flask.ext.* namespace was removed in Flask 1.0

app = Flask(__name__)
cache = Cache(app, config={'CACHE_TYPE': 'simple'})  # SimpleCache backend


@cache.memoize()  # memoize must be called, not applied bare
def my_cached_func():
    x = load_some_big_csv()
    return x


@app.route('/')
def index():
    maybe_cached = my_cached_func()
    return maybe_cached

with SimpleCache as the backend.

A) If I have 16 cores and run 16 worker processes, and a user's request lands on worker 1, will the cache be filled only in that one process, or will the cached value be accessible from all workers? Based on how Python multiprocessing works, it will be cached in only one process, right?

Therefore, if 16 users hit my endpoint, I'll end up with 16 copies of the same cached value in RAM.
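A minimal sketch of why that happens, assuming (as the question does) that SimpleCache is just an in-process store, here stood in for by a plain dict, and that each WSGI worker is a separate process:

```python
# Assumption: SimpleCache's store is per-process, modelled here as a dict.
# Each worker process gets its own copy, so each fills its own cache.
import multiprocessing

cache = {}  # stand-in for the SimpleCache store


def handle_request(q):
    # Runs inside a worker process: it sees, and fills, its OWN cache.
    cache['big_csv'] = 'rows...'
    q.put(len(cache))


def simulate(n_workers=2):
    q = multiprocessing.Queue()
    workers = [multiprocessing.Process(target=handle_request, args=(q,))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return [q.get() for _ in workers]


if __name__ == '__main__':
    print(simulate())   # each worker filled its own cache
    print(len(cache))   # the parent's cache is still empty
```

Under this model, N workers that each serve the endpoint end up holding N independent copies of the cached value.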

B)

What happens if 10 users hit endpoint 'x' on the same worker 1? Will Flask hand each request a reference to the value already in memory (no additional memory allocated for the 10 requests), or a copy of the value itself (roughly 10x the memory consumption during 10 sequential-but-nearly-parallel requests)?
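The crux of question B can be sketched in isolation. Assuming the cache stores values via pickle (as the comments below discuss for werkzeug's SimpleCache), every cache hit deserializes a fresh object rather than returning a shared reference:

```python
# Sketch of a pickle-backed cache lookup (assumption: the store holds
# pickle.dumps(value) and every get() runs pickle.loads on it).
import pickle

store = {'big_csv': pickle.dumps(['row1', 'row2'])}

hit1 = pickle.loads(store['big_csv'])
hit2 = pickle.loads(store['big_csv'])

print(hit1 == hit2)   # True  - equal contents
print(hit1 is hit2)   # False - distinct objects, distinct memory
```

So each hit allocates its own copy; whether those copies exist simultaneously depends on whether requests in one worker actually run concurrently.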

zacko
  • A) yes B) users on same core share the same cached instance so no extra memory see https://stackoverflow.com/a/13531014/202168 – Anentropic Jun 17 '22 at 16:14
  • B) Does that mean that my cache is mutable? Aka performing an append on a list that comes from the cache will change the returned value from the cache for every user that comes next? @Anentropic – zacko Jun 17 '22 at 16:28
  • It seems flask-cache uses werkzeug.contrib.cache, and that code uses pickle dumps/loads https://github.com/pallets/werkzeug/blob/0.15.x/src/werkzeug/contrib/cache.py#L322 ...which will ensure that code cannot mutate the value in the cache itself, only via the cache interface – Anentropic Jun 17 '22 at 16:33
  • So... requests on same core are sequential so that should mean no extra memory still. But if you also have multithreading (or gevent, asyncio etc) on top of multiple processes then you could have extra memory used as concurrent requests on same core could fetch multiple instances from the cache that would be kept in memory simultaneously – Anentropic Jun 17 '22 at 16:36
  • But pickle.loads() will generate a new reference each time. How is it possible that no extra memory is being used? @Anentropic – zacko Jun 17 '22 at 16:42
  • Because the new instances are not used concurrently, unless you add multithreading on top of multiprocessing (which is a common deployment strategy). i.e. I'd expect them to go out of scope and get GC'd when one request has been processed and before next request is handled – Anentropic Jun 17 '22 at 18:32
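The mutation-safety point from the comments can be checked with a small sketch (assuming, as above, a pickle round-trip on every lookup): mutating the copy one request got back does not change what the next request receives.

```python
# Assumption: a pickle-backed store, as in werkzeug's SimpleCache.
# Mutating a returned value cannot corrupt the cached original.
import pickle

store = {'key': pickle.dumps([1, 2, 3])}

first = pickle.loads(store['key'])
first.append(4)                      # mutate this request's private copy

second = pickle.loads(store['key'])  # the next request still sees [1, 2, 3]
print(first)
print(second)
```

Mutation only leaks between requests if the backend returns the stored object itself without a serialization round-trip.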

0 Answers