117

So I have an understanding of how Node.js works: it has a single listener thread that receives an event and then delegates it to a worker pool. The worker thread notifies the listener once it completes the work, and the listener then returns the response to the caller.

My question is this: if I stand up an HTTP server in Node.js and call sleep on one of my routed path events (such as "/test/sleep"), the whole system comes to a halt. Even the single listener thread. But my understanding was that this code is happening on the worker pool.

Now, by contrast, when I use Mongoose to talk to MongoDB, DB reads are an expensive I/O operation. Node seems to be able to delegate the work to a thread and receive the callback when it completes; the time taken to load from the DB does not seem to block the system.

How does Node.js decide to use a thread pool thread vs the listener thread? Why can't I write event code that sleeps and only blocks a thread pool thread?
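
Roughly what I mean, as a sketch (hypothetical; JavaScript has no real sleep, so a busy-wait loop stands in for it):

```js
const http = require('http');

http.createServer((req, res) => {
  if (req.url === '/test/sleep') {
    // "sleep" for 5 seconds by spinning the CPU -- while this spins,
    // no other request on the server gets serviced
    const end = Date.now() + 5000;
    while (Date.now() < end) { /* busy wait */ }
    res.end('done sleeping');
  } else {
    res.end('hello');
  }
}).listen(3000);
```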

Rahil Wazir
  • 10,007
  • 11
  • 42
  • 64
Haney
  • 32,775
  • 8
  • 59
  • 68
  • @Tobi - I have seen that. It still doesn't answer my question. If the work was on another thread, the sleep would only affect that thread and not the listener as well. – Haney Mar 25 '14 at 19:28
  • 11
    A genuine question, where you try to understand something by yourself, and when you can't find an exit to the maze, you ask for help. – Rafael Eyng May 27 '15 at 23:29

4 Answers

273

Your understanding of how node works isn't correct... but it's a common misconception, because the reality of the situation is actually fairly complex, and typically boiled down to pithy little phrases like "node is single threaded" that over-simplify things.

For the moment, we'll ignore explicit multi-processing/multi-threading through cluster and webworker-threads, and just talk about typical non-threaded node.

Node runs in a single event loop. It's single threaded, and you only ever get that one thread. All of the javascript you write executes in this loop, and if a blocking operation happens in that code, then it will block the entire loop and nothing else will happen until it finishes. This is the typical single-threaded nature of node that you hear so much about. But it's not the whole picture.

Certain functions and modules, usually written in C/C++, support asynchronous I/O. When you call these functions and methods, they internally manage passing the call on to a worker thread. For instance, when you use the fs module to request a file, the fs module passes that call on to a worker thread, and that worker waits for its response, which it then presents back to the event loop that has been churning on without it in the meantime. All of this is abstracted away from you, the node developer, and some of it is abstracted away from the module developers through the use of libuv.
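
As a rough illustration of that fs example (a minimal sketch, not the answer's original code), the callback runs later while the event loop keeps churning:

```js
const fs = require('fs');

fs.readFile(__filename, 'utf8', (err, data) => {
  // runs later, back on the event loop, once the worker has finished the I/O
  if (err) throw err;
  console.log('file length:', data.length);
});

// runs immediately -- the event loop was never blocked by the read
console.log('readFile requested, moving on');
```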

As pointed out by Denis Dollfus in the comments (from this answer to a similar question), the strategy used by libuv to achieve asynchronous I/O is not always a thread pool; specifically, in the case of the http module a different strategy appears to be used at this time. For our purposes here it's mainly important to note how the asynchronous context is achieved (by using libuv) and that the thread pool maintained by libuv is one of multiple strategies offered by that library to achieve asynchronicity.


On a mostly related tangent, there is a much deeper analysis of how node achieves asynchronicity, and some related potential problems and how to deal with them, in this excellent article. Most of it expands on what I've written above, but additionally it points out:

  • Any external module that you include in your project that makes use of native C++ and libuv is likely to use the thread pool (think: database access)
  • libuv has a default thread pool size of 4, and uses a queue to manage access to the thread pool - the upshot is that if you have 5 long-running DB queries all going at the same time, one of them (and any other asynchronous action that relies on the thread pool) will be waiting for those queries to finish before they even get started
  • You can mitigate this by increasing the size of the thread pool through the UV_THREADPOOL_SIZE environment variable, so long as you do it before the thread pool is required and created: process.env.UV_THREADPOOL_SIZE = 10; (see the sketch just after this list)
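
A minimal sketch of that last point (illustrative only; whether a larger pool actually helps depends on your workload):

```js
// Hypothetical first lines of an app. The size must be set before the first
// thread-pool request (fs, dns.lookup, crypto.pbkdf2, many native DB
// drivers), because libuv creates the pool lazily on first use.
process.env.UV_THREADPOOL_SIZE = 10;

const fs = require('fs');

// With the default pool of 4, only 4 of these reads run concurrently and the
// rest queue; with a pool of 10, all can be in flight at once.
for (let i = 0; i < 10; i++) {
  fs.readFile(__filename, () => console.log(`read ${i} done`));
}
```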

If you want traditional multi-processing or multi-threading in node, you can get it through the built-in cluster module or various other modules such as the aforementioned webworker-threads, or you can fake it by chunking up your work and manually using setTimeout, setImmediate, or process.nextTick to pause your work and continue it in a later loop iteration so other work can complete (but that's not recommended).
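
A minimal sketch of that chunking approach (assuming a contrived sumRange job; setImmediate yields back to the event loop between slices):

```js
// Contrived long-running job: sum a large range in small slices so the event
// loop can service other events between slices.
function sumRange(from, to, acc, done) {
  const sliceEnd = Math.min(from + 1e6, to);
  for (let i = from; i < sliceEnd; i++) acc += i;
  if (sliceEnd === to) return done(acc);
  // Yield to the event loop, then continue with the next slice.
  setImmediate(() => sumRange(sliceEnd, to, acc, done));
}

sumRange(0, 1e8, 0, (total) => console.log('total:', total));
```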

Please note, if you're writing long running/blocking code in javascript, you're probably making a mistake. Other languages will perform much more efficiently.

Jason
  • 13,606
  • 2
  • 29
  • 40
  • 1
    Holy crap, this completely clears it up for me. Thank you so much @Jason! – Haney Mar 25 '14 at 19:55
  • 7
    No problem :) I found myself where you are not too long ago, and it was tough to come to a well defined answer because on one side you have C/C++ devs for whom the answer is obvious, and on the other you have typical web devs who haven't delved too deeply into these sorts of questions before. I'm not even sure my answer is 100% technically correct when you get down to the C level, but it's right in the broad strokes. – Jason Mar 25 '14 at 20:00
  • Exactly. I feel like the majority of developers suffer from the Leaky Abstraction, especially with an environment and paradigm as *opinionated* as Node.js - many love to use it and benefit from the async pattern greatly; few understand what's actually happening because they never feel the need to write async libraries themselves. – Haney Mar 25 '14 at 20:12
  • 4
    Using the thread pool for network requests would be a huge resource waste. According to [this question](http://stackoverflow.com/questions/15526546/confusion-about-node-js-internal-asynchronous-i-o-mechanism) "It does the async network I/O based on the async I/O interfaces in different platforms, such as epoll, kqueue and IOCP, without a thread pool" -- which makes sense. – Denis Dollfus Dec 16 '14 at 14:52
  • Thanks for that clarification, @DenisDollfus, I'll update the answer appropriately to indicate that async I/O uses different strategies depending on the function. – Jason Dec 16 '14 at 17:32
  • 1
    How is `nodejs` able to handle concurrent connections, given that it runs on a single thread? If I get `1000` concurrent requests, will each request be handled one after another? If yes, won't there be an unwanted lag in the response sent to each request? What will happen if it receives `1000 * 1000` concurrent requests? – Suhail Gupta May 04 '16 at 07:01
  • @SuhailGupta there's a couple answers to your question - I'll just hit the highlights. First, you *can* use more than one process in your node application through the use of `cluster`. Second, there are practical examples of people achieving over 100k concurrent requests - that means without lag. The reason this works is because a properly written node application will do very little actual work in its one thread, it will pass most of the work to asynchronous modules written with `libuv`. It may someday hit 1m requests, but nothing else out there is close to that either. – Jason May 04 '16 at 11:21
  • 1
    @Jason My perception (without the use of any cluster) is that for 1000 simultaneous HTTP requests, each request will be handled one after another, so the last request (the 1000th) will be served late in comparison to the request served first. (A lag seems unavoidable here.) – Suhail Gupta May 04 '16 at 12:17
  • Your perception is incorrect. For 1000 simultaneous HTTP requests, each request will *start* being handled serially, but the first request won't be completed until well after all 1000 requests are already started (for any non-trivial application), the result being that they all finish in about the same time it takes to finish one. The reason for this is that even though the javascript you write runs in a single thread, the (node.js library) modules that you use in your javascript are written in other languages and use the thread pool (see: my answer that we're commenting on) – Jason May 04 '16 at 14:57
  • 1
  • ... that said, if you do some heavy lifting in the main javascript thread directly, or you don't have enough resources or don't manage them appropriately to give enough headroom to the threadpool, you could introduce lag at a lower concurrency threshold - the upshot is that, for the same system resources, you'll typically experience higher throughput with node.js than with other options (though there are other event-based systems in other languages that aim to challenge that - I haven't seen recent benchmarks though) - it's clear that an event based model outperforms a threaded model. – Jason May 04 '16 at 15:03
  • @Jason Okay. A threaded model starts a new thread each time a request is received. Event based model will not invoke a new thread each time but use the already available threads in the pool. What is the difference? A thread in the pool will also occupy memory and so will a thread just invoked to handle a new request. – Suhail Gupta May 05 '16 at 06:08
  • We're moving well beyond the scope of this question, I'd suggest you go ask a new question so you can get a proper answer, at a minimum you'll get plenty of links to already existing explanations. – Jason May 05 '16 at 11:34
  • When a single iteration of the event loop finishes (executing my code, running on the main thread), the main thread will check the event queue to see if there's any new event or callback that should be called (possibly using a thread from the thread pool), and if not, it will execute my code again in the next iteration? – user5539357 Jun 24 '16 at 11:13
  • I mean, there's a single, main thread that executes my code and checks for events. It executes my code only once, and then starts checking for new events in the event queue - if an event happens, its callback will be executed either on the main thread (the one that checked for events in event queue) or in the thread pool, right? – user5539357 Jun 24 '16 at 11:22
  • Mostly correct. The specifics may have changed a bit since I last looked into it, but as I recall it worked like this: when execution starts, the first iteration of the event loop runs your code. If you make a call to an asynchronous (evented) method, it sets up the event and, if applicable, begins the asynchronous execution. The loop finishes and then at the beginning of the next iteration the first thing it does is check for fired events. If it finds one, it executes it in the main thread, handling async calls as before. – Jason Jun 24 '16 at 11:34
  • If you make a call to an asynchronous (evented) method, it sets up the event - what do you mean? If I call an asynchronous method, there's no event, I'm just calling a method. – user5539357 Jun 24 '16 at 14:00
  • If there is no event, the method is not asynchronous. The event could be "DB call finishes and returns" or it could be "next event loop" or "3 seconds have passed" or "network call has returned". If it's not waiting for some future event to complete, then it's running immediately and synchronously. – Jason Jun 24 '16 at 14:16
  • Oh, sure, right. But it's generated AFTER the async method completes, not before. So it first begins the asynchronous execution and then sets up an event. – user5539357 Jun 24 '16 at 14:21
  • What I should have said is the event *listener* is set up. – Jason Jun 24 '16 at 14:31
  • @Jason Suppose a single DB query takes 2 sec to complete; then how much time do 10 queries take? According to your answer it should take 4 sec, because queries will execute in sets of 5 (4 by worker threads, 1 by the listener thread). Am I right? – Aabid Feb 08 '18 at 16:58
  • 1
    @Aabid The listener thread does not execute a database query, so it'll take roughly 6 seconds for all 10 of those queries to complete (with the default thread pool size of 4). If you need to do any work in javascript that does not require the results of that database query to complete, e.g. more requests come in that do not require any asynchronous work to be completed by the thread pool, it will continue to work in the main event loop. – Jason Feb 08 '18 at 17:44
  • Thank you. Is there any limit on setting UV_THREADPOOL_SIZE? I mean, does it depend on my hardware, or maybe the OS, or anything else? – Shahin Ghasemi Oct 22 '19 at 08:58
  • According to [the libuv documentation](http://docs.libuv.org/en/v1.x/threadpool.html) the maximum is currently set to `1024`. But, the optimal setting for you is likely to be determined by both your hardware *and* the type of workload. I found a [good explanation](https://github.com/nodejs/node/issues/22468#issuecomment-416795256) of the things you need to consider when tuning `UV_THREADPOOL_SIZE` - some types of work should be geared to match thread count to CPU count, other types could be some multiple of that. – Jason Oct 22 '19 at 14:59
23

So I have an understanding of how Node.js works: it has a single listener thread that receives an event and then delegates it to a worker pool. The worker thread notifies the listener once it completes the work, and the listener then returns the response to the caller.

This is not really accurate. Node.js has only a single "worker" thread that does javascript execution. There are threads within node that handle IO processing, but to think of them as "workers" is a misconception. They really just handle IO and a few other details of node's internal implementation, and as a programmer you cannot influence their behavior other than through a few misc parameters such as MAX_LISTENERS.

My question is this: if I stand up an HTTP server in Node.js and call sleep on one of my routed path events (such as "/test/sleep"), the whole system comes to a halt. Even the single listener thread. But my understanding was that this code is happening on the worker pool.

There is no sleep mechanism in JavaScript. We could discuss this more concretely if you posted a code snippet of what you think "sleep" means. There's no such function to call to simulate something like time.sleep(30) in python, for example. There's setTimeout but that is fundamentally NOT sleep. setTimeout and setInterval explicitly release, not block, the event loop so other bits of code can execute on the main execution thread. The only thing you can do is busy loop the CPU with in-memory computation, which will indeed starve the main execution thread and render your program unresponsive.
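
To make the contrast concrete (a minimal sketch, not part of the original answer): a busy loop starves the event loop, while setTimeout leaves it free:

```js
// Blocks: timers, network callbacks, everything else waits until this returns.
function badSleep(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin the CPU */ }
}

// Does not block: the callback fires later and the event loop stays free.
setTimeout(() => console.log('ran 5 seconds later'), 5000);

badSleep(5000); // nothing else runs during these 5 seconds
```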

How does Node.js decide to use a thread pool thread vs the listener thread? Why can't I write event code that sleeps and only blocks a thread pool thread?

Network IO is always asynchronous. End of story. Disk IO has both synchronous and asynchronous APIs, so there is no "decision": node.js behaves according to which core API functions you call, sync vs normal async. For example: fs.readFile vs fs.readFileSync. For child processes, there are also separate child_process.exec and child_process.execSync APIs.

The rule of thumb is: always use the asynchronous APIs. The valid reasons to use the sync APIs are initialization code in a network service before it starts listening for connections, or simple scripts that do not accept network requests (build tools and that kind of thing).
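
A minimal sketch of that rule of thumb (config.json, greeting.txt, and appName are placeholders):

```js
const fs = require('fs');
const http = require('http');

// Startup/initialization: synchronous is fine, nobody is waiting yet.
const config = JSON.parse(fs.readFileSync('config.json', 'utf8'));

http.createServer((req, res) => {
  // Request path: always asynchronous, so one slow read can't stall
  // every other connection.
  fs.readFile('greeting.txt', 'utf8', (err, text) => {
    if (err) { res.statusCode = 500; return res.end('error'); }
    res.end(`${config.appName}: ${text}`);
  });
}).listen(3000);
```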

Peter Lyons
  • 142,938
  • 30
  • 279
  • 274
  • 1
    Where are these asynchronous APIs coming from? I get what you're saying, but whoever wrote these APIs opted into IOCP/async. How did they choose to do this? – Haney Mar 25 '14 at 19:41
  • OK, so how would I, as a Node.js library developer, opt in to these IO processing threads? That's really my question. – Haney Mar 25 '14 at 19:43
  • Every API in the node.js core that takes a callback function is asynchronous. This is a design decision Ryan Dahl made when he first created node.js. Basically Ryan designed the node.js API this way because in his opinion this was a good approach to writing high-performance, high-concurrency network servers. – Peter Lyons Mar 25 '14 at 19:43
  • You don't opt in. You get them all the time by default. You don't really opt-out either. You use the synchronous version by mistake and your program doesn't work properly. Ignore the *Sync APIs in core. They are for specialized and unusual use cases and you don't need them at all when writing typical node.js programs. – Peter Lyons Mar 25 '14 at 19:45
  • 3
    His question is how he would write his own time intensive code and not block. – Jason Mar 25 '14 at 19:46
  • That doesn't answer my question, Peter. Someone, somewhere opted into the IO pool. Programs do exactly what you tell them to; there is no non-determinism. Unless Node.js uses a random calculation method to decide when to use the pool and when not to, Mongoose knows how to opt into the pool and wrote the code to do so. How do I write the same code to opt-in myself with my own lib? – Haney Mar 25 '14 at 19:47
  • "All network IO is asynchronous". Really. You don't opt in. Every DB call you make gets the "pool" as you call it automatically. It is impossible to opt out as there is no such API available in node. node.js just works this way all the time by design. There is no synchronous networking in node.js. Your word choice makes it sound like you aren't actually trying to comprehend what I'm saying. – Peter Lyons Mar 25 '14 at 19:52
  • Peter, is it then the case that all libs which use networking will call Node.js's networking methods, which in turn use the pool? Is that what you're telling me? All libs share the common Node.js methods which themselves abstract the pool away? And there's no other option? – Haney Mar 25 '14 at 19:53
  • Code snippets will guide this discussion to useful places vs. paragraphs of ill-defined terms and semantic debate. Post a sample program to illustrate your point and question precisely, please. – Peter Lyons Mar 25 '14 at 19:54
  • 1
    Yes. Node provides basic UDP, TCP, and HTTP networking. It provides ONLY asynchronous "pool-based" APIs. All node.js code in the world without exception uses these pool-based asynchronous APIs, as they are simply all that is available. Filesystem and child processes are a different story, but networking is consistently asynchronous. – Peter Lyons Mar 25 '14 at 19:56
  • 5
    Careful, Peter, lest you be the proverbial pot to his kettle. He wants to know how the writers of the network API did it, not how people who use the network API do it. I eventually gained an understanding of how node behaves re: non-blocking events because I wanted to write my own non-blocking code that has nothing to do with networking or any of the other built in asynchronous APIs. It's pretty clear David wants to do the same. – Jason Mar 25 '14 at 19:56
  • I don't know what he wants to know because his question is long and unclear. But they did it by writing libuv (eventually). https://github.com/joyent/libuv – Peter Lyons Mar 25 '14 at 19:57
  • 2
    Node doesn't use thread pools for IO, it uses native non-blocking IO, the only exception is `fs`, as far as I know – vkurchatkin Mar 25 '14 at 20:20
  • @vkurchatkin native non-blocking IO is an IOCP which employs a thread pool. – Haney Mar 25 '14 at 20:28
  • @vkurchatkin is right, I'm surprised to see all answers mentioning "all io happens in thread pool". This is just wrong: most IO happens in the same thread, which uses libuv asynchronous networking under the hood (epoll/kqueue/IOCP system calls in linux/bsd/windows). Thread pool IO is only for file operations on linux where you can't set a file handle to be non-blocking – Andrey Sidorov Mar 25 '14 at 22:27
  • @DavidHaney no it's not, with IOCP/windows even for file operations – Andrey Sidorov Mar 25 '14 at 22:29
  • @AndreySidorov OK, so you've told me what it isn't. Care to enlighten me? – Haney Mar 26 '14 at 13:32
  • IOCP is used on windows and, when it is used, it works with the same event loop (not a separate thread) as epoll-based io. The event loop happens in the same thread the javascript vm executes in – Andrey Sidorov Mar 26 '14 at 20:28
12

How, when, and by whom the thread pool is used:

First off, when we run Node on a computer, it starts a process (among the other processes on the machine) called the node process, and it keeps running until you kill it. This running process is our so-called single thread.


This single-thread mechanism makes it easy to block a node application, but it is also one of the unique features that Node.js brings to the table. So, again: if you run your node application, it will run in just a single thread, no matter whether you have 1 or a million users accessing it at the same time.

So let's understand exactly what happens in the single thread of nodejs when you start your node application. First the program is initialized, then all the top-level code is executed, meaning all the code that is not inside any callback function (remember, all code inside callback functions will be executed under the event loop).

After that, all the required modules are loaded, all the callbacks are registered, and finally the event loop starts for your application.


So, as discussed before, all the callback functions and the code inside them execute under the event loop. Inside the event loop, the load is distributed across different phases. Anyway, I'm not going to discuss the event loop itself here.

For the sake of a better understanding of the thread pool, imagine that in the event loop, the code inside one callback function only executes after the code inside another callback function has finished. Now, if some of those tasks are too heavy, they would block our single nodejs thread. That's where the thread pool comes in, which, just like the event loop, is provided to Node.js by the libuv library.

So the thread pool is not part of nodejs itself; it's provided by libuv so that heavy duties can be offloaded to it. libuv executes that work in its own threads and, after execution, returns the results to the event loop.


The thread pool gives us four additional threads, completely separate from the main single thread, and we can actually configure it with up to 128 threads.

All these threads together form the thread pool, and the event loop can then automatically offload heavy tasks to it.

The fun part is that all this happens automatically behind the scenes. It's not us developers who decide what goes to the thread pool and what doesn't.

Many tasks go to the thread pool, such as (a short sketch follows this list):

  • All operations dealing with files
  • Everything related to cryptography, like hashing passwords
  • All compression work
  • DNS lookups
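
A small sketch of that offloading in action (illustrative; crypto.pbkdf2 is one of the calls libuv routes to the pool):

```js
const crypto = require('crypto');

const start = Date.now();

// With the default pool of 4, four of these hashes run in parallel and the
// fifth waits in the queue for a free thread.
for (let i = 1; i <= 5; i++) {
  crypto.pbkdf2('password', 'salt', 100000, 64, 'sha512', () => {
    console.log(`hash ${i} done after ${Date.now() - start} ms`);
  });
}

// The main thread is still free while the hashing happens.
console.log('hashes requested, event loop not blocked');
```
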
Rafiq
  • 8,987
  • 4
  • 35
  • 35
  • Do you mind clarifying one thing: when we create a worker thread, are these threads taken from the thread pool or spawned anew? – MiKr13 Feb 01 '23 at 14:04
0

This misunderstanding is merely the difference between pre-emptive multi-tasking and cooperative multitasking...

The sleep turns off the entire carnival because there is really one line to all the rides, and you closed the gate. Think of it as "a JS interpreter and some other things" and ignore the threads...for you, there is only one thread, ...

...so don't block it.