I am working on a Flask app that serves data-intensive pages. I would like to hold several blobs of data in memory, shared across the application, because I expect many users to need them. So I want to instantiate some global data objects when the Flask application first starts: a sort of curated in-memory database of, say, NumPy objects. Is there a way to do this? Thank you.

neela

2 Answers

I'm not sure that caching a blob of data globally at the application level is possible with vanilla Flask. And even if it is possible, I'm not convinced it's a good idea.

Here's why I think it shouldn't be done. When each request works in its own "sandbox," you know that a request only touches the data in its own sandbox. If you make the sandbox global and accessible to everyone, you can no longer control access to it, and you can easily run into concurrent-access problems. You would then have to design a queuing process so that only one request accesses the global sandbox at a time, and you might have to extend the built-in objects to include locks (again, for data integrity when many requests have access).
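
The locking concern only bites when requests write to the shared object. Here is a minimal sketch of what guarding writes could look like, assuming a hypothetical `SharedStore` class (it is not part of Flask) and plain threads standing in for concurrent requests:

```python
import threading


class SharedStore:
    """Hypothetical process-wide store; every access is serialized by a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def increment(self, key):
        # The read-modify-write happens under the lock to avoid lost updates.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + 1

    def get(self, key):
        with self._lock:
            return self._data.get(key)


store = SharedStore()


def worker():
    # Each thread plays the role of one request that writes to the store.
    for _ in range(1000):
        store.increment("hits")


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = store.get("hits")  # 4 threads x 1000 increments each
```

If the data is filled once at startup and only read afterwards, the lock around reads is not needed; it matters as soon as any request can modify the store.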

Also keep in mind that Flask's g is not a way around this problem. The name is something of a misnomer: g is not global to the application, so it does not live at a level where every user of the application shares it. Rather, a fresh g is created for each request, and anything stored on it is gone when that request ends. Check out this SO post for more information.
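
The per-request lifetime of g can be checked with Flask's test client. This is only a sketch; the route names `/set` and `/get` and the attribute `value` are made up for illustration:

```python
from flask import Flask, g

app = Flask(__name__)


@app.route("/set")
def set_value():
    # Stored on g for the duration of this request only.
    g.value = "stored"
    return "ok"


@app.route("/get")
def get_value():
    # A later request sees a fresh g, so the attribute is absent.
    return g.get("value", "missing")


client = app.test_client()
client.get("/set")
result = client.get("/get").get_data(as_text=True)
```

Here `result` comes back as "missing": the value set during the first request is not visible in the second.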


EDIT: If you really need it, I think the simple cache provided by werkzeug should suffice. Check the docs for usage.

franklin
  • Hi @franklin. Thanks for your response. I am aware of the issues you raise. Still, I am curious whether there is a solution, because the global data I mention will be read-only, so I don't anticipate concurrent-access issues. If each user request needs, say, 100 MB of data, fetching it from the database would be very slow. Also, if different users require the same data, it is inefficient to maintain a copy for each user. So I am thinking that one copy of this data, held in memory and shared across all instances, is the most efficient way to use it. Hence my question. – neela Feb 27 '16 at 22:40
  • @neela I think it's possible to implement a global cache using either the werkzeug lib as suggested [here](http://flask.pocoo.org/docs/0.10/patterns/caching/) or by using the Flask-Cache plugin. – franklin Feb 27 '16 at 22:43
  • Would you know whether data I get out of the cache is returned by reference or by value? I am assuming I cannot store Python numpy arrays as they are, so I will need to pickle them on the way in and unpickle them on the way out. When several instances read the data, would I have multiple copies of it, or a single copy in the cache and multiple references to it? Thanks again. – neela Feb 27 '16 at 22:58
  • I'm not sure. numpy arrays are ordinary Python objects, so, like regular Python lists, they are passed by "assignment." Take a look at this [SO post](http://stackoverflow.com/a/986145/778694). It's close to pass by reference. But I think that's beyond the scope of this question. – franklin Feb 27 '16 at 23:14
  • Thank you. I will investigate whether this will work for me. – neela Feb 27 '16 at 23:29
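
On the reference-versus-copy question in the comments above, a rough sketch may help. It assumes the cache serializes values with pickle (whether a given cache backend does so depends on the backend), and it uses a plain list as a stand-in for a large array to stay dependency-free. `REGISTRY` is a made-up name for a module-level dictionary:

```python
import pickle

# Stand-in for a large array; a plain list keeps the sketch dependency-free.
blob = list(range(5))

# Module-level registry: every lookup hands back a reference to the same object.
REGISTRY = {"blob": blob}
same_object = REGISTRY["blob"] is REGISTRY["blob"]

# Cache that stores pickled bytes: every read deserializes a fresh copy.
pickled = pickle.dumps(blob)
first = pickle.loads(pickled)
second = pickle.loads(pickled)
distinct_copies = first is not second
equal_contents = first == second == blob
```

So a pickling cache gives each reader its own copy of the data, while a module-level object held in the same process is shared by reference.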

You could put them in a configuration file. Using import my_config would bring the objects in for all users. Now the question is what the users can do with the data. If they are expected to write to it, then this might not be the best idea.

So you declare variables in my_config.py:

import numpy as np

# Built once, when this module is first imported
my_var = np.arange(24).reshape(6, 4)

Then you use them in your Flask app:

import flask
from my_config import my_var

# more flask app here...
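
For a self-contained picture of the same idea in a single file, here is a hedged sketch. The loader `load_blobs`, the name `BLOBS`, and the route `/blob/<name>` are all hypothetical, and a plain list stands in for the NumPy array:

```python
from flask import Flask, abort, jsonify


def load_blobs():
    """Hypothetical loader; a real app might read files or a database here."""
    return {"squares": [n * n for n in range(5)]}


# Runs once, when this module is first imported by the server process.
BLOBS = load_blobs()

app = Flask(__name__)


@app.route("/blob/<name>")
def get_blob(name):
    # Handlers only read BLOBS; nothing here mutates it.
    if name not in BLOBS:
        abort(404)
    return jsonify(BLOBS[name])


client = app.test_client()
found = client.get("/blob/squares")
missing = client.get("/blob/unknown")
```

Note that a multi-process server imports the module in each worker process, so each worker holds its own copy of the data.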
Back2Basics
  • Thank you for your response. Flask has a general config.py file. Did you mean adding another my_config.py file in the same directory, with the global variable declared as null or even instantiated? Or do I declare it in the config file and instantiate it later? I thought config files were just for application-level parameters; I did not know that I could assign complex data objects to variables in config files. – neela Feb 27 '16 at 23:32
  • Thank you. I will try this. – neela Feb 28 '16 at 00:18
  • I have a follow-up. If two users make two calls to the Flask app, does my_config.py run twice, creating two instances of my_var? Also, if my_config.py is imported in two separate parts of the Flask app, does it run twice for a single call to the app by one user? – neela Feb 28 '16 at 00:38
  • @neela I don't think so. The config module is run only when the Flask application is initialized. It is independent of requests. – franklin Feb 28 '16 at 16:25
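
The follow-up about repeated imports can be checked directly: Python keeps imported modules in sys.modules, so a module body runs once per process no matter how many places import it. A small sketch, using a throwaway module with the made-up name `demo_config` written to a temporary directory:

```python
import importlib
import os
import sys
import tempfile

# Write a throwaway module (hypothetical name) that builds its data on import.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "demo_config.py"), "w") as handle:
    handle.write("my_var = list(range(24))\n")

sys.path.insert(0, module_dir)
importlib.invalidate_caches()

first = importlib.import_module("demo_config")
var_after_first_import = first.my_var

# A second import, as if from another part of the app.
second = importlib.import_module("demo_config")

same_module = first is second
# If the module body had run again, my_var would be a new list object.
same_data = second.my_var is var_after_first_import
cached = "demo_config" in sys.modules
```

Both imports return the same module object and the same `my_var`, which is why the data is built once per process rather than once per request.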