Original problem
I am creating an API using Express that queries a SQLite DB and outputs the result as a PDF using the html-pdf module.
The problem is that certain queries might take a long time to process, so I would like to decouple the actual query call from the Node server where Express is running. Otherwise the API might slow down if several clients are running heavy queries.
My idea was to move the execution of the SQLite query into a Python script. The API can then call this script and avoid using Node to query the DB.
Current problem
After quickly creating a Python script that runs a SQLite query, and calling it from my API using child_process.spawn(), I found that Express seems to receive an exit code as soon as the Python script starts to execute the query.
To confirm this, I created a simple python script that just sleeps in between printing two messages and the problem was isolated.
To reproduce this behavior you can create a python script like this:
print("test 1")
sleep(1)
print("test 2")
Then call it from express like this:
router.get('/async', function(req, res, next) {
var python = child_process.spawn('python3', ['script.py']); // 'script.py' is a placeholder for the path to the script above
var output = "";
python.stdout.on('data', function(data){
output += data
console.log(output)
});
python.on('close', function(code){
if (code !== 0) {
return res.status(200).send(code)
}
return res.status(200).send(output)
});
});
If you then run the Express server and do a GET /async, you will get a "1" as the exit code.
However, if you comment out the sleep(1) line, the server successfully returns
test 1
test 2
as the response.
You can even trigger this using sleep(0).
I have tried flushing stdout before the sleep, piping the result instead of using .on('close'), and passing the -u option when calling python (to use unbuffered streams).
None of this has worked, so I'm guessing there's some mechanism baked into Express that closes the request as soon as the spawned process sleeps OR finishes (instead of only when it finishes).
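One way to test that guess outside of Express is a minimal sketch like the one below. It is an assumption-laden stand-in: the child here is a small Node program (run through process.execPath) that mimics the Python script by pausing between two prints, and runSleepingChild is a hypothetical helper name. Node emits 'close' only after the child has exited and its stdio streams have closed, so the output should contain both lines if a pause alone does not end the process.

```javascript
const { spawn } = require('child_process');

// Child program: prints, waits one second, prints again, then exits normally.
// It stands in for the Python script so the sketch needs nothing but Node.
const childSource = `
  console.log('test 1');
  setTimeout(() => console.log('test 2'), 1000);
`;

// Spawns the child and resolves once the 'close' event fires.
function runSleepingChild() {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ['-e', childSource]);
    let output = '';
    child.stdout.on('data', (chunk) => { output += chunk; });
    child.on('error', reject); // spawning itself failed
    child.on('close', (code) => resolve({ code, output }));
  });
}

runSleepingChild().then(({ code, output }) => {
  console.log('exit code:', code);
  console.log(output);
});
```

If this prints both lines with a zero exit code, the pause is probably not what ends the child, and the non-zero code is more likely coming from the script itself.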
I also found an answer about using child_process.fork(), but I'm not sure whether that would behave differently. There is also another question that is very similar to my issue, but it has no answer.
Main question
So my question is, why does the Python script send an exit signal when doing a sleep() (or, in the case of my query script, when running cursor.execute(query))?
If my supposition is correct that Express closes the request when a spawned process sleeps, can this be avoided?
One potential solution I found suggested the use of ZeroRPC, but I don't see how that would make express keep the connection open.
The only other option I can think of is using something like Kue so that my express API will only need to respond with some sort of job ID, and then Kue will actually spawn the python script and wait for its response, so that I can query the result via some other API endpoint.
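To make the job ID idea concrete, here is a rough sketch that does not use Kue at all: it swaps in a plain in-memory Map as the job store, which would not survive a server restart. The names startJob, getJob and jobs are hypothetical, and the commented-out routes at the bottom only illustrate how it might be wired into an Express router.

```javascript
const { spawn } = require('child_process');

// In-memory job store (an assumption for illustration; a real queue would persist jobs).
const jobs = new Map();
let nextId = 1;

// Starts the command in the background and returns a job ID immediately.
function startJob(command, args) {
  const id = String(nextId++);
  const job = { status: 'running', code: null, stdout: '', stderr: '' };
  jobs.set(id, job);

  const child = spawn(command, args);
  child.stdout.on('data', (chunk) => { job.stdout += chunk; });
  child.stderr.on('data', (chunk) => { job.stderr += chunk; });
  child.on('error', (err) => {
    // Spawning failed, e.g. the executable was not found.
    job.status = 'failed';
    job.stderr += err.message;
  });
  child.on('close', (code) => {
    job.code = code;
    job.status = code === 0 ? 'done' : 'failed';
  });
  return id;
}

// Returns the current state of a job, or undefined for an unknown ID.
function getJob(id) {
  return jobs.get(id);
}

// Possible wiring ('script.py' is a placeholder path):
// router.post('/jobs', (req, res) => {
//   res.status(202).json({ id: startJob('python3', ['script.py']) });
// });
// router.get('/jobs/:id', (req, res) => {
//   const job = getJob(req.params.id);
//   return job ? res.json(job) : res.sendStatus(404);
// });
```

With this shape the first request returns right away, and the client polls the second endpoint until status is no longer 'running'.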
Is there something I'm missing?
Edit:
AllTheTime's comment is correct regarding the sleep issue. After I added from time import sleep it worked. However, my sqlite script is not working yet.
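The missing import only surfaced as exit code 1 because the route reads the child's stdout but never its stderr, which is where Python writes the traceback for an unhandled exception. A minimal sketch of capturing both streams follows; runAndCapture is a hypothetical helper name, and the commented usage assumes a placeholder script path. The same approach may help show why the sqlite script is failing.

```javascript
const { spawn } = require('child_process');

// Runs a command and resolves with its exit code plus everything it wrote
// to stdout and stderr.
function runAndCapture(command, args) {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args);
    let stdout = '';
    let stderr = '';
    child.stdout.on('data', (chunk) => { stdout += chunk; });
    child.stderr.on('data', (chunk) => { stderr += chunk; });
    child.on('error', reject); // e.g. the executable was not found
    child.on('close', (code) => resolve({ code, stdout, stderr }));
  });
}

// Possible use inside the route ('script.py' is a placeholder path):
// runAndCapture('python3', ['script.py']).then(({ code, stdout, stderr }) => {
//   if (code !== 0) return res.status(500).send(stderr);
//   return res.status(200).send(stdout);
// });
```

Sending stderr back with a 500 status is only for debugging; in a real API it would probably be logged rather than returned to the client.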