2

I've read tons of articles and stackoverflow questions, and I saw a lot of information about thread pool, but no one talks about physical CPU core usage. I believe this question is not duplicated.

Given that I have a quad-core computer and libuv thread pool size of 4, will Node.js utilize all those 4 cores when processing lots of i/o requests(maybe more than thousands)?

I'm also curious that which i/o request uses thread pool. No one gives clear and full list of request. I know that Node.js event loop is single threaded but uses a thread pool to handle i/o such as accessing disk and db.

ZeroCho
  • 1,358
  • 10
  • 23

1 Answers1

5

I'm also curious that which i/o request uses thread pool.

Disk I/O uses the thread pool.

Network I/O is async from the beginning and does not use threads.

With disk I/O, the individual disk I/O calls still present to Javascript as non-blocking and asynchronous even though they use threads in their native code implementation. When you exceed more disk I/O calls in process than the size of the thread pool, the disk I/O calls are queued and when one of the threads frees up, the next disk I/O call in the queue will run using that now available thread. Since the Javascript for the disk I/O is all non-blocking and assumes a completion callback will get called sometime in the future, the queuing of requests when the thread pool is all busy just means it will take longer to get to the later I/O requests, but otherwise the Javascript programming interface is not affected.

Given that I have a quad-core computer and libuv thread pool size of 4, will Node.js utilize all those 4 cores when processing lots of i/o requests(maybe more than thousands)?

This is not up to node.js and is hard to answer in the absolute for that reason. The first referenced article below says that on Linux, the I/O thread pool will use multiple cores and offers a small demo app that shows that.

This is up to the specific OS implementation and the thread scheduler that it uses. node.js just happily creates the threads and uses them and the OS then decides how to make use of the CPU given what it is being asked to do overall on the system. Since threads in the same process often have to communicate with one another in some way, using a separate CPU for different threads in the same process is a lot more complicated.

There are a couple node.js design patterns that are guaranteed to take advantage of multiple cores (in any modern OS)

  1. Cluster your app and create as many clusters as you have processor cores. This also has the advantage that each cluster has its own I/O thread pool that can work independently and each can execute it's own Javascript independently. With only one node.js process and multiple cores, you never get more than one thread of Javascript execution (this is where node.js is referred to as single threaded - even though it does use threads in its library implementations). But, with clustering, you get independent Javascript execution for each clustered server process.

  2. For individual tasks that might be CPU-intensive (for example, image processing), you can create a work queue and a pool of child worker processes that you hand work off to. This has some benefits in common with clustering, but it is more special purpose where you know exactly where the CPU bottleneck is and you want to attack it specifically.

Other related answers/articles:

how libuv threads in nodejs utilize multi core cpu

Node.js on multi-core machines

Taking Advantage of Multi-Processor Environments in node.js

When is the thread pool used?

jfriend00
  • 683,504
  • 96
  • 985
  • 979
  • Loving this answer, it's very thorough. I would like to mention that Node.js is pretty low on my list of choices as a language to use for any intensive image processing though. Perhaps you could replace that example with something like web crawling? Personally I could see the benefit of using multiple threads to do headless browsing and round-robin the link-following among all the threads. – Patrick Roberts Jul 28 '17 at 05:49
  • @PatrickRoberts - For web crawling with parsing of the crawled results, I would likely just cluster, rather than a work-queue. In that case, you pretty much benefit from the whole app being clustered into multiple processes, not just one piece of it. And, you would only have to coordinate between clusters when getting the next URL to process or saving your results. Everything in between is completely independent. – jfriend00 Jul 28 '17 at 05:53
  • Oh, I missed the bit about the "work queue", I too was just thinking of clustering. I guess in that case image processing is a good example of a task-level parallelization that Node.js could take advantage of. – Patrick Roberts Jul 28 '17 at 05:56
  • 2
    @PatrickRoberts - And FYI the worker processes don't even have to be node.js processes. They could be native code processes or PHP processes or any other type of process. – jfriend00 Jul 28 '17 at 05:57
  • Thank you for the great answer! One last question for clear understanding. If OS implementation is set to use only one CPU not four, does Node.js still create 4 threads(thread pool size that I set)? Does one CPU have to switch context between 4 threads? – ZeroCho Jul 28 '17 at 08:31
  • 1
    @ZeroCho - I believe it will still create multiple threads. Part of the reason for threads for disk I/O is that the native code implementation uses blocking disk I/O so you can still get better throughput with multiple threads even with only one CPU because some of the threads may be blocking at any given moment and the threads allow the others to proceed in parallel. – jfriend00 Jul 28 '17 at 08:48