
I just realized a problem with (single-threaded) Node.js:

  1. The server begins responding to a request, and the request runs until it blocks because of I/O.

  2. When the request processor blocks, the server kicks in and goes back to step #1, processing more requests.

  3. Whenever a request processor blocks for I/O, the server checks to see if any request is finished. It processes those in FIFO order to respond to clients, then continues processing as before.

Doesn't that mean that there should be a stack overflow at #2, if too many requests start blocking each other and none of them finishes? Why/why not?

user541686
  • Sharing the same "stack" for all requests would be completely impractical - how could that possibly work for a single thread and multiple requests? I'm guessing each request has its own, heap-allocated (or equivalent), state. – Mat Jan 15 '12 at 10:18
  • @Mat: It's quite possible with something like [QueueUserAPC](http://msdn.microsoft.com/en-us/library/windows/desktop/ms684954.aspx) -- it's just that it blows up after a certain point. So you suspect that JS is not really using the CPU stack for servicing requests at all? – user541686 Jan 15 '12 at 10:19
  • That function you link to doesn't add anything to a thread's CPU stack, it adds stuff to a (separately allocated) queue. Such a technique could very well be used by node.js, though they'd probably use something portable (and quite possibly homegrown). Try and come up with a way to use the actual CPU stack in the scenario you describe above, you'll see it can't work. – Mat Jan 15 '12 at 10:23
  • @Mat: No, it *does* work -- whenever the thread sleeps in an alertable state (e.g. with [WaitForSingleObjectEx](http://msdn.microsoft.com/en-us/library/windows/desktop/ms687036.aspx)), any queued APCs are called on the thread, and the wait is then satisfied with `WAIT_IO_COMPLETION`. It works quite well. – user541686 Jan 15 '12 at 10:29
  • That's not the point. The queued objects are not put on the thread's CPU stack. The thread is (temporarily) diverted to process objects stored in the queue. Its CPU stack is used for that, but that's just one extra frame. When one queued object is processed, the stack goes back to its initial state (and that process restarts). There is no risk of bursting the stack with this technique (as long as 1. the stack is not "full" when the process is alerted and 2. the data processing doesn't itself blow it). There is no increase in the thread stack that depends on the data queued (size or number of items). – Mat Jan 15 '12 at 10:33
  • @Mat: I never said it depends on the *data* though. I said it depends on the *number* of APCs queued, because if an APC *itself* blocks, it'll start *another* APC on the *same* thread, and the stack continuously grows... until it pops. – user541686 Jan 15 '12 at 10:36
  • Hadn't realized that blocking within the APC was both allowed and resulted in _recursive_ APC processing. Yes, as documented, that could blow your stack (which is pretty much what I meant by "cannot work" - with that scheme, APC completions that were "stacked on top of each-other" no longer are independent of each other, so this scheme would not work for node.js in my understanding). A separate state, with a proper state-machine, doesn't require this "recursive stacking". – Mat Jan 15 '12 at 10:52
  • (Or, quite possibly, I'm full of it and I'm missing your point, sorry.) – Mat Jan 15 '12 at 10:54
  • @Mat: Yes we both agree that this could blow up the stack, so my exact question is, how come this doesn't happen in Node? – user541686 Jan 15 '12 at 10:58

3 Answers


node.js prevents the stack overgrowth you describe by using asynchronous techniques everywhere¹.

Anything that could block uses callbacks for further processing, not blocking calls. This avoids stack growth completely, and makes it easy to re-enter the event loop (that "drives" the underlying real I/O and request dispatching).

Consider this pseudo-code:

fun() {
  string = net.read();
  processing(string);
}

The thread is blocked on the read; the stack frame can only be freed up after both the read completes and the processing is done.

Now if all your code is like:

fun() {
  net.read(onDone: processing(read_data));
}

And if you implement read like this:

net.read(callback) {
  iorequest = { read, callback };
  io.push_back(iorequest);
}

fun is done as soon as read can queue a read I/O with the associated callback. fun's stack is unwound without blocking - it returns "immediately" to the event loop without any thread stack leftovers.

I.e. you can move on to the next callback (re-enter the event loop) without keeping any per-request data on the thread stack.

So node.js avoids stack overgrowth by using asynchronous callbacks wherever blocking calls would otherwise happen in "user" code.
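To make this concrete, here's a self-contained sketch (not Node's actual API - `read`, `pending`, and the fake data are invented for illustration) showing how a callback-based read lets `fun` return before its "I/O" completes, with the queued callback later run from the top level:

```javascript
// Minimal sketch of callback-based I/O (illustrative, not Node's internals).
const pending = []; // completions waiting for the "event loop"

function read(callback) {
  // Instead of blocking, queue the completion; the "I/O" result is
  // faked here so the example is self-contained.
  pending.push(() => callback("some data"));
}

const results = [];

function fun() {
  read(data => results.push(data)); // returns immediately, no blocking
}

fun(); // fun's stack frame is gone before the callback ever runs

// "Event loop": drain the queue from the top level, one frame at a time.
while (pending.length) pending.shift()();

console.log(results); // → [ 'some data' ]
```

Note that when the callback finally runs, `fun`'s frame is long gone: each callback starts from the top-level loop, so the stack never accumulates per-request frames.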

For more about this, please check out the node.js 'about' page, and the first set of slides linked at the end.

¹ Well, nearly, I guess.


You mention QueueUserAPC in a comment. With that type of processing, a queued APC is allowed to block, and the next APC in the queue gets processed on the thread's stack, making it a "recursive" dispatch.

Say we have three APCs pending (A, B and C). We get:

Initial state:

Queue   ABC
Stack   xxxxxxxx

Thread sleeps so APC dispatch starts, enters processing for A:

Queue   BC
Stack   AAAAxxxxxxxx

A blocks, B is dispatched on the same stack:

Queue   C
Stack   BBBBBBAAAAxxxxxxxx

B blocks, C is dispatched:

Queue   
Stack   CCCCCCCBBBBBBAAAAxxxxxxxx

It's clearly visible that if enough blocking APCs are pending, the stack will eventually blow up.

With node.js, the requests are not allowed to block. Instead, here's a mock-up of what would happen for the same three requests:

Queue      ABC
Stack      xxxxxxxx

A starts processing:

Queue      BC
Stack      AAAAxxxxxxxx

Now A needs to do something that blocks - in node.js, it actually can't. What it does is queue another request (A') (presumably with a context - simplistically a hash with all your variables):

I/O queue  A'
Queue      BC
Stack      AAAAxxxxxxxx

Then A returns and we're back to:

I/O queue  A'
Queue      BC
Stack      xxxxxxxx

Notice: no more A stack frame. The pending I/O queue is actually managed by the OS (using epoll or kqueue or whatever). The main thread's event loop checks both the OS I/O-ready states and the pending (needing-CPU) queue.
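Here's a toy sketch of that loop (not libuv's real implementation - the queue names and `fakeRead` are invented, and the "I/O" completes instantly so the example is self-contained):

```javascript
// Toy event loop: alternates between callbacks that need CPU and
// completed "I/O", all dispatched from the top level of one thread.
const runQueue = []; // callbacks needing CPU
const ioQueue = [];  // simulated pending I/O: { name, callback, done }

const log = [];

function fakeRead(name, callback) {
  // Queue an "I/O request"; marked done immediately for the demo.
  ioQueue.push({ name, callback, done: true });
}

// Two requests, A and B; A does "I/O" and queues a continuation A'.
runQueue.push(() => { log.push("A"); fakeRead("A'", () => log.push("A'")); });
runQueue.push(() => { log.push("B"); });

// The event loop: run one callback, then promote finished I/O.
while (runQueue.length || ioQueue.length) {
  if (runQueue.length) runQueue.shift()();
  for (let i = ioQueue.length - 1; i >= 0; i--) {
    if (ioQueue[i].done) runQueue.push(ioQueue.splice(i, 1)[0].callback);
  }
}

console.log(log); // → [ 'A', 'B', "A'" ]
```

Each callback runs to completion and returns before the next one starts, so B gets CPU while A's "I/O" is outstanding, exactly as in the snapshots above.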

Then B gets some CPU:

I/O queue  A'
Queue      C
Stack      BBBBBBBxxxxxxxx

Same thing, B wants to do I/O. It queues a new callback and returns.

I/O queue  A'B'
Queue      C
Stack      xxxxxxxx

If B's I/O request completes in the meantime, the next snapshot could look like this:

I/O queue  A'
Queue      B'
Stack      CCCCCxxxxxxxx

At no point is there more than one callback stack frame on the processing thread. Since blocking calls are not provided by the API, the stack doesn't exhibit the type of recursive growth the APC pattern does.
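The contrast between the two dispatch styles can be sketched directly (illustrative only - the depth counters are just instrumentation to show how deep the nesting gets):

```javascript
// Compare APC-style recursive dispatch with event-loop dispatch.
let maxDepth = 0;
let depth = 0;

function runRecursive(queue) { // APC-like: next item runs on top of the blocker
  if (queue.length === 0) return;
  depth++; maxDepth = Math.max(maxDepth, depth);
  queue.shift()();
  runRecursive(queue);         // the "blocked" frame stays on the stack
  depth--;
}

function runLoop(queue) {      // node.js-like: back to the top level each time
  while (queue.length) {
    depth++; maxDepth = Math.max(maxDepth, depth);
    queue.shift()();
    depth--;                   // frame unwinds before the next job starts
  }
}

const jobs = () => [() => {}, () => {}, () => {}];

maxDepth = 0; runRecursive(jobs());
const recursiveDepth = maxDepth; // grows with the number of queued jobs: 3

maxDepth = 0; runLoop(jobs());
const loopDepth = maxDepth;      // constant regardless of queue length: 1
```

With the recursive scheme the depth scales with the number of pending items (and eventually blows the stack); with the loop it stays at one frame no matter how many items are queued.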

Mat
  • You can "release" a thread's stack? How? Isn't the callback going to be called on that same thread (and stack)? – user541686 Jan 15 '12 at 11:33
  • "release" wasn't the right term. `fun`'s stack frame is very short-lived, `fun` returns as soon as the request is queued. `fun` could itself be a callback, the point is that each callback returns to the event loop without blocking. There is no recursive stack growth for event processing with this approach, the next callback is started from the toplevel event loop, not a nested stack frame. – Mat Jan 15 '12 at 12:04

node.js is based on Google's V8 JavaScript engine, which utilises an event loop.

pyrotechnick

There are a few key aspects of working with the event loop in Node.js that are different from working with threads.

In Node.js, the runtime does not interrupt your function in the middle to start executing another function. Instead, you must return from the current function before Node.js's concurrency can kick in.

function readAndWriteItem(id) {
  var callback = function(item) {
    item.title = item.title + " " + item.title;
    writeItem(item);
  };
  readItem(id, callback);
}

Note that in this example, a callback closure is created and readItem is called. Presumably, readItem will queue up the issuing of a query and set up its own internal callback to be executed when the result of the query is ready. So this example function readAndWriteItem merely queues up a message to be sent across the wire, sets up some more callbacks, and immediately returns. Once this function returns, Node.js can work its event loop magic.

Since the function has returned, and since this is the case across the board when you use Node.js, there is no stack overflow.
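Here's a self-contained version of that sketch; `readItem` and `writeItem` are hypothetical stand-ins, implemented here with `setImmediate` and an in-memory store so the callback demonstrably runs from the event loop, not from readAndWriteItem's own frame:

```javascript
// Hypothetical stand-ins for the readItem/writeItem used above.
const store = { 7: { title: "hello" } }; // fake database
const written = [];                      // records what writeItem "wrote"

function readItem(id, callback) {
  // Deliver the result on a later event-loop turn, like real I/O would.
  setImmediate(() => callback(store[id]));
}

function writeItem(item) {
  written.push(item);
}

function readAndWriteItem(id) {
  readItem(id, function(item) {
    item.title = item.title + " " + item.title;
    writeItem(item);
  });
}

readAndWriteItem(7); // returns immediately; the write happens later
console.log(written.length); // → 0 (the callback hasn't run yet)
```

Right after the call, `written` is still empty - the write only happens once the event loop picks up the queued completion, on an empty stack.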

yfeldblum