I need a web server that acts as an API. Every 5 minutes, I parse a huge 300MB file from a remote location and build a large graph of many Python objects. The API needs to run algorithms on that graph. Because the graph and the algorithms are extremely complex, I'd prefer not to use SQL to store and query the data. Parsing the file on every API call is out of the question because the file is so large.
The main reason I resorted to global variables is that building the graph takes a lot of time. Even if I used a database, I'd still have to rebuild the graph for every API request. If the graph is always available in memory, that would significantly shorten the response time. Yes, it's an unorthodox project.
This is what I have so far: I'm subclassing Thread and having it update one of its members, which is the graph. Something like this:

    from flask import Flask, render_template
    from threading import Thread
    import time

    app = Flask(__name__,
                static_folder="./dist/static",
                template_folder="./dist")


    class AutoUpdater(Thread):
        def __init__(self):
            Thread.__init__(self)
            self.daemon = True  # do not keep the process alive on shutdown
            self.graph = None   # stays None until the first build finishes
            self.start()

        def run(self):
            while True:
                # Build the graph and update self.graph
                time.sleep(5 * 60)


    A = AutoUpdater()


    @app.route('/')
    def hello_world():
        # Run algorithms with A.graph
        pass  # placeholder: the real handler returns a response

`A.graph` will never be modified by the user, only by the scheduled task I run; the user only runs graph algorithms, which do not modify the graph. I am aware that `A` isn't thread safe, but in this case is it stable? And if the user did have to modify the graph, would it still be stable if I implemented locks?
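
To make the two scenarios concrete, here is a minimal sketch of the pattern I have in mind; it is not a tested or definitive implementation. Everything in it is an assumption for illustration: `build_graph` is a hypothetical stand-in for the real parser, the graph is a plain dict rather than my actual objects, and `GraphHolder`, `snapshot`, `replace` and `add_edge` are made-up names. The idea is that the updater builds the new graph completely in a local variable and only then rebinds the attribute, so a reader that grabbed a reference earlier keeps working on the old, complete graph. For the "user modifies the graph" case, the sketch serializes writers with a lock and swaps in a modified copy instead of mutating in place.

```python
import copy
import threading


def build_graph(version):
    # Hypothetical stand-in for parsing the remote file.
    # It always returns a brand-new object and never mutates an old one.
    return {"version": version, "edges": {"a": ["b"], "b": ["a"]}}


class GraphHolder:
    def __init__(self):
        self._lock = threading.Lock()  # only writers take this lock
        self._graph = build_graph(0)

    def snapshot(self):
        # Readers take ONE reference and use it for the whole request,
        # so a swap in the middle of a request does not affect them.
        return self._graph

    def replace(self, new_graph):
        # Scheduled task: the graph is fully built before this call,
        # then published with a single attribute rebinding.
        with self._lock:
            self._graph = new_graph

    def add_edge(self, src, dst):
        # User modification (assumed case): copy, modify the copy, swap.
        # Readers never see a half-modified graph, at the cost of a full
        # copy, which may be too expensive for a very large graph.
        with self._lock:
            updated = copy.deepcopy(self._graph)
            updated["edges"].setdefault(src, []).append(dst)
            self._graph = updated


holder = GraphHolder()
old = holder.snapshot()              # what a running request would hold
holder.replace(build_graph(1))       # what the updater thread would do
holder.add_edge("a", "c")            # what a modifying user would do
print(old["version"], holder.snapshot()["version"])  # prints: 0 1
```

In this sketch a request would call `holder.snapshot()` once at the top of the view function and never touch `holder` again. If copying the graph on every modification is too costly, the alternative would be for readers and writers to hold the same lock, which makes requests wait on each other.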