9

I'd like to use an in-memory thread-local cache for a value from the database that isn't going to change during a request/response cycle, but gets called hundreds (potentially thousands) of times. My limited understanding is that using a "global"/module variable is one way to implement this type of cache.

e.g.:

#somefile.py

foo = None

def get_foo(request):
  global foo
  if not foo:
    foo = get_foo_from_db(request.blah)
  return foo

I'm wondering whether using this type of "global" is thread-safe in Python, so that I can be confident get_foo_from_db() will be called exactly once per request/response cycle in Django (using either runserver or gunicorn+gevent). Is my understanding correct? This thing gets called enough that even using memcached to store the value is going to be a bottleneck (I'm profiling it as we speak).

B Robster

2 Answers

7

No, access to globals is not thread-safe. Threads do not get their own copy of globals; globals are shared among all threads in a process.

The code:

if not foo:
    foo = get_foo_from_db(request.blah)

compiles to several Python bytecode instructions:

  2           0 LOAD_FAST                1 (foo)
              3 POP_JUMP_IF_TRUE        24

  3           6 LOAD_GLOBAL              0 (get_foo_from_db)
              9 LOAD_FAST                0 (request)
             12 LOAD_ATTR                1 (blah)
             15 CALL_FUNCTION            1
             18 STORE_FAST               1 (foo)
             21 JUMP_FORWARD             0 (to 24)

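A thread switch can occur between any two bytecode instructions, so another thread could alter foo after you tested it but before you assigned to it. (The listing above was produced with foo as a local name; with the global declaration the interpreter emits LOAD_GLOBAL and STORE_GLOBAL instead, but the gap between the test and the store is the same.)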

Martijn Pieters
  • Django global variables are not thread-safe. But if we are using workers/processes then it's safe as different processes have their own memory spaces. – Akhil Mathew Aug 11 '22 at 07:39
  • 1
    @AkhilMathew: sure, but **only if you are not also using threads**. But otherwise, separate processes are separate Python programs. This is not Django specific, Django is just another Python library here. – Martijn Pieters Aug 11 '22 at 20:51
4

No, you are wrong on two counts.

Firstly, the use of "threads" is a bit vague here. Depending on how its server is configured, Django can be served using threads, processes, or both (see the mod_wsgi documentation for a full discussion). If there is a single thread per process, then you can guarantee that only one instance of a module will be available to each process. But that is highly dependent on that configuration.

Even so, it is still not the case that there will be "exactly one" call to that function per request/response cycle. This is because the lifetime of a process is entirely unrelated to that cycle. A process will last for multiple requests, so that variable will persist for all of those requests.
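One way to tie the cached value to a single request rather than to the process is to store it on the request object itself, which is discarded when the response has been sent. The following is only a sketch under assumptions: the attribute name _cached_foo is made up for illustration, get_foo_from_db is a stand-in for the real query, and types.SimpleNamespace stands in for a Django request so the example runs without Django.

```python
from types import SimpleNamespace

db_calls = []  # records each simulated database hit (demo only)

def get_foo_from_db(blah):
    # Hypothetical stand-in for the real database query.
    db_calls.append(blah)
    return {"blah": blah}

def get_foo(request):
    # Cache the value on the request object, so its lifetime matches
    # the request/response cycle instead of the process lifetime.
    try:
        return request._cached_foo
    except AttributeError:
        request._cached_foo = get_foo_from_db(request.blah)
        return request._cached_foo

# Stand-ins for two separate incoming requests.
request_one = SimpleNamespace(blah="x")
request_two = SimpleNamespace(blah="y")

get_foo(request_one)
get_foo(request_one)  # served from the attribute, no second lookup
get_foo(request_two)  # a new request triggers a fresh lookup
```

Because nothing is stored at module level, the result does not depend on whether the server uses threads, processes, or both.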

Daniel Roseman
  • Makes sense. Follow up question here: http://stackoverflow.com/questions/15365780/how-to-implement-thread-safe-in-memory-cache-of-value-in-django – B Robster Mar 12 '13 at 15:49