I'd like to share some large Python objects in Django. They are just big tables of data that I'd like to access randomly and quickly in memory. Think of reading a dict that's, say, 35 MB on disk: not huge, not small. I'm treating them as immutable: read in at initialization and never changed. I'm willing to restart the server to pick up changes.

What is the best, most Django-friendly way to do this?

This question is like mine. This answer describes how to use Django's low-level in-memory cache. According to the documentation, there is an in-memory cache that is in-process and thread-safe. Perfect. However, it only stores objects that can be pickled. I don't want my 35 MB Python object pickled; that seems awkward. And does getting it back out unpickle it again? On every request? That sounds slow.

This blog post mentions django-lrucache-backend, which skips the pickling. However, it was last updated 2 years ago, and it also says not to use it for "large data tables" (not sure why).

Recommendations?

EDIT: I understand the traditional answer, but I'd rather avoid pickling and Redis. Two reasons: 1) I'd rather avoid writing a bunch of lines of code (pickling) or maintaining another component (Redis), 2) it seems slower to unpickle large objects (is it on every request?).
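
To put a rough number on the pickling concern: as far as I can tell from the backend's source, Django's local-memory cache pickles the value on set() and unpickles it again on every get(). The sketch below uses only the standard library (no Django) to compare rebuilding a dict from a pickle with a plain lookup in an object that is already in memory. The table contents and size are made-up stand-ins, and the timings will vary by machine.

```python
import pickle
import timeit

# Made-up stand-in for the big table; the real one would be larger.
table = {i: str(i) for i in range(200_000)}

# Roughly what a pickling cache does on set(): serialize once.
blob = pickle.dumps(table, pickle.HIGHEST_PROTOCOL)

# Roughly what it does on every get(): rebuild the whole object.
unpickle_s = timeit.timeit(lambda: pickle.loads(blob), number=5) / 5

# Direct lookup in an object that is already held in memory.
lookup_s = timeit.timeit(lambda: table[123_456], number=5) / 5

print(f"unpickle per get: {unpickle_s:.4f}s, direct lookup: {lookup_s:.8f}s")
```

If the unpickle time is noticeable at this size, it would be paid on each cache read, which is the cost I'd like to avoid.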

dfrankow

2 Answers

Depending on the object you want to store, you need to pickle and unpickle it, but this is not a performance issue. You have two possibilities: if it is a dict, you can use a JSON structure; otherwise, just use django-redis as the cache backend and let Django store the object in the cache (Redis). django-redis also supports connection pooling.

  • But I don't want to pickle and unpickle, I don't want to write code turning a dict into JSON, and I don't want to install and maintain Redis. I just want to use an object in memory. – dfrankow Dec 18 '22 at 13:13
  • It will not work. A Python object is a specific programming object. Every storage can store only a binary or text format. – Leonardo Di Lella Dec 19 '22 at 12:52
  • Thanks for your response. I don't want to store it, only hold it in memory, in a specific process. – dfrankow Dec 19 '22 at 20:03
  • Storing and holding are the same; it must be an interchange format, and that is not a Python object. – Leonardo Di Lella Dec 19 '22 at 22:26
  • I don't know where "must be" comes from. For a computer, many things are possible, including sharing Python objects in memory. – dfrankow Dec 20 '22 at 14:47
  • Think of a TensorFlow model. It's a binary blob that could take a long time to pickle and unpickle. – dfrankow Feb 07 '23 at 00:21

I ended up hanging my data off the Django AppConfig object, specifically by loading it in the ready() method.

Others also seem to do this, for example here. That example didn't use the ready method, but it did use AppConfig.

dfrankow