I am quite confused about why my promise is blocking requests to my Node app.

Here is my simplified code:

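var express = require('express');
var SomeModule = require('somemodule');

var app = express();
// the module exports a constructor, so create an instance before calling doSomething()
var someModule = new SomeModule();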

app.get('/', function (req, res) {        
    res.status(200).send('Main');
});

app.get('/status', function (req, res) {
    res.status(200).send('Status');
});         

// Init Promise
someModule.doSomething({}).then(function(){},function(){}, function(progress){
    console.log(progress);        
});

var server = app.listen(3000, function () {
    var host = server.address().address;
    var port = server.address().port;

    console.log('Example app listening at http://%s:%s in %s environment',host, port, app.get('env'));
});

And the module:

var q = require('q');

function SomeModule(){
    this.doSomething = function(){
        return q.Promise(function(resolve, reject, notify){            
            for (var i=0;i<10000;i++){
               notify('Progress '+i);
            }
            resolve();
        });
    }          
}

module.exports = SomeModule;

Obviously this is very simplified. The promise function does some work that takes anywhere from 5 to 30 minutes and only has to run when the server starts up. There is NO async operation in that promise function. It's just a lot of data processing, loops, etc.

I want to be able to make requests right away, though. So what I expect is that when I start the server, I can go straight to 127.0.0.1:3000 and see Main, and the same for any other request.

Eventually I want to see the progress of that task by accessing /status, but I'm sure I can make that work once the server works as expected.

At the moment, when I open / it just hangs until the promise job finishes.

Obviously I'm doing something wrong...

Tomas
  • JS is not multithreaded, so if you don't have any async task within your promise and this part needs 30min to complete, then it will block your server for 30min. – t.niese Nov 20 '15 at 22:01
  • ...and even if you do have async tasks, if they are processor bound they will still "block" your event loop. – kliron Nov 20 '15 at 22:06
  • You'll have to move the process that takes 30 minutes off to its own child process, or code it in such a way that it doesn't block the event loop. – Kevin B Nov 20 '15 at 22:13
  • @KevinB Thanks, would you mind pointing me in right direction as to how to do that ? – Tomas Nov 20 '15 at 22:17
  • The for loop would become a recursive function that calls itself on the next tick (process.nextTick), but I kinda doubt that's what your actual code does, so fixing it would likely be more complex than that. – Kevin B Nov 20 '15 at 22:23
  • If your task is IO-bound do as @KevinB says with `process.nextTick`. If your task is CPU-bound that won't help at all. In that case you need to delegate the task to another process. An example solution would be to spawn a child process, do the work and pipe the results back to the parent process when done. See https://nodejs.org/api/child_process.html for more. – kliron Nov 20 '15 at 22:31
  • For specific guidance on making the module code non-blocking, please show us the ACTUAL code, not just a simulation. We can only help with how to change it to be non-blocking if we can see exactly what the real code does. If it is just CPU bound and doesn't have opportunities to use async I/O, then you will just have to move the operation to a child process. – jfriend00 Nov 20 '15 at 22:35
  • jfriend00 There is no point posting the real code, as it really is just a couple of very large for loops. @kliron Thanks for the link! If you write it up as an answer I will accept it. – Tomas Nov 23 '15 at 14:39

2 Answers

The main thread of JavaScript in node.js is single threaded. So, if you run some giant loop that is processor bound, it will hog that one thread and no other JS will run in node.js until that one operation is done.

So, when you call:

someModule.doSomething()

and that is all synchronous, then it does not return until it is done executing and thus the lines of code following that don't execute until the doSomething() method returns. And, just so you understand, the use of promises with synchronous CPU-hogging code does not help your cause at all. If it's synchronous and CPU bound, it's just going to take a long time to run before anything else can run.
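
A small sketch of that point, assuming a native Promise rather than q and a 50ms busy loop standing in for the real work (the `order` array is only there for illustration). The executor runs to completion before the line after the constructor, and a timer that was already due has to wait for it:

```javascript
var order = [];

// A timer that is due almost immediately; it can only fire once the thread is free.
setTimeout(function () {
    order.push('timer');
    // By now order is: executor start, executor end, after constructor, then, timer
    console.log(order.join(' -> '));
}, 0);

var p = new Promise(function (resolve) {
    order.push('executor start');
    // Stand-in for CPU-bound work: spin for about 50ms without yielding.
    var start = Date.now();
    while (Date.now() - start < 50) {}
    order.push('executor end');
    resolve();
});

// This line is not reached until the executor above has finished.
order.push('after constructor');

p.then(function () {
    order.push('then');
});
```

Wrapping the loop in a promise changes when the `then` callback is called, not when the loop itself runs.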

If there is I/O involved in the loop (like disk I/O or network I/O), then there are opportunities to use async I/O operations and make the code non-blocking. But, if not and it's just a lot of CPU stuff, then it will block until done and no other code will run.

Your opportunities for changing this are:

  1. Run the CPU consuming code in another process. Either create a separate program that you run as a child process that you can pass input to and get output from or create a separate server that you can then make async requests to.

  2. Break the blocking work into chunks where you execute 100ms chunks of work at a time, then yield the processor back to the event loop (using something like setTimeout()) to allow other things in the event queue to be serviced and run before you pick up and run the next chunk of work. You can see Best way to iterate over an array without blocking the UI for ideas on how to chunk synchronous work.

As an example, you could chunk your current loop. This runs up to 100ms of cycles and then breaks execution to give other things a chance to run. You can set the cycle time to whatever you want.

var q = require('q');

function SomeModule(){
    this.doSomething = function(){
        return q.Promise(function(resolve, reject, notify){
            var cntr = 0, numIterations = 10000, timePerSlice = 100;
            function run() {
                if (cntr < numIterations) {
                    var start = Date.now();
                    // work until this slice's time budget (timePerSlice, in ms) is used up
                    while (Date.now() - start < timePerSlice && cntr < numIterations) {
                        notify('Progress '+cntr);
                        ++cntr;
                    }
                    // give some other things a chance to run and then call us again
                    // setImmediate() is also an option here, but setTimeout() gives all
                    // other operations a chance to run alongside this operation
                    setTimeout(run, 10);
                } else {
                    resolve();
                }
            }
            run();
        });
    };
}

module.exports = SomeModule;
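
A variant sketch of the same chunking idea, assuming a native Promise and setImmediate() instead of q and setTimeout(). The names `runChunked` and `progress` are hypothetical. It keeps the progress in a plain object, which is one way a route such as /status could report how far the job has got:

```javascript
// Hypothetical helper: call work(i) for i = 0..total-1 in time slices of
// about sliceMs milliseconds, yielding to the event loop between slices.
function runChunked(total, sliceMs, work) {
    return new Promise(function (resolve) {
        var i = 0;
        function slice() {
            var start = Date.now();
            while (i < total && Date.now() - start < sliceMs) {
                work(i);
                ++i;
            }
            if (i < total) {
                // let queued I/O callbacks (such as incoming requests) run first
                setImmediate(slice);
            } else {
                resolve(i);
            }
        }
        slice();
    });
}

// Shared state that a /status handler could read while the job is running.
var progress = { done: 0 };

var job = runChunked(200000, 5, function (i) {
    progress.done = i + 1;
});

job.then(function (count) {
    console.log('finished', count, 'iterations');
});
```

A /status handler would then only need to send `progress.done`. The slice length is a trade-off: shorter slices keep requests responsive, longer slices finish the job sooner.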
jfriend00
  • This is directly related to your write up and has visuals in case anyone cares. Cross post from /r/node https://www.youtube.com/watch?v=8aGhZQkoFbQ "Philip Roberts: What the heck is the event loop anyway? | JSConf EU 2014" – Ravenous Nov 23 '15 at 14:47

If your task is IO-bound, go with process.nextTick (setImmediate is also an option, and unlike repeated process.nextTick calls it lets pending I/O callbacks run between steps). If your task is CPU-bound, asynchronous calls won't offer much performance-wise. In that case you need to delegate the task to another process. An example solution would be to spawn a child process, do the work and pipe the results back to the parent process when done.

See nodejs.org/api/child_process.html for more.
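
A minimal sketch of that approach, with some assumptions: `heavyWork` and `runInChild` are hypothetical names, and the worker source is passed to a second Node process with `-e` only to keep the example in one file. In a real application the work would more likely live in its own file started with child_process.fork().

```javascript
var childProcess = require('child_process');

// Hypothetical CPU-bound job standing in for the real data processing.
function heavyWork(n) {
    var sum = 0;
    for (var i = 0; i < n; i++) {
        sum += i;
    }
    return sum;
}

// Run heavyWork(n) in a separate Node process and resolve with its result.
// The parent's event loop stays free while the child does the work.
function runInChild(n) {
    return new Promise(function (resolve, reject) {
        // Source for the child: the function itself plus one line that
        // writes the result to stdout, which is piped back to the parent.
        var source = heavyWork.toString() +
            '; process.stdout.write(String(heavyWork(' + Number(n) + ')));';
        childProcess.execFile(process.execPath, ['-e', source], function (err, stdout) {
            if (err) {
                reject(err);
            } else {
                resolve(Number(stdout));
            }
        });
    });
}

runInChild(1000).then(function (sum) {
    console.log('result from child process:', sum);
});
```

The parent only waits on the child's output, so the server can keep answering requests while the job runs.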

If your application needs to do this often then forking lots of child processes quickly becomes a resource hog - each time you fork, a new V8 process will be loaded into memory. In this case it is probably better to use one of the multiprocessing modules like Node's own Cluster. This module offers easy creation and communication between master-worker processes and can remove a lot of complexity from your code.

See also a related question: Node.js - Sending a big object to child_process is slow

kliron