57

I am trying to understand Threading in NodeJS and how it works.

Currently what i understand:

Cluster: -

  • Built on top of Child_process, but with TCP distributed between clusters.
  • Best for distributing/balancing incoming http requests, while bad for cpu intensive tasks.
  • Works by taking advantage of available cores in cpu, by cloning nodeJS webserver instances on other cores.

Child_process:

  • Make use also of different cores available, but its bad since it costs huge amount of resources to fork a child process since it creates virtual memory.

  • Forked processes could communicate with the master thread through events and vice versa, but there is no communication between forked processes.

Worker threads:

  • Same as child process, but forked processes can communicate with each other using bufferArray

1) Why worker threads is better than child process and when we should use each of them?

2) What would happen if we have 4 cores and clustered/forked nodeJS webserver 4 times(1 process for each core), then we used worker threads (There is no available cores) ?

Abdelfattah
  • 655
  • 1
  • 7
  • 7
  • 5
    Sounds like the classical tradeoff between threads and processes. Threads share the same memory space, so context switch is faster (basically, just the register contents are saved per thread). However, synchronizing between them and ensuring mutual exclusion (global data integrity) is under your own responsibility (as a programmer). Processes run in different memory spaces, so mutual exclusion is guaranteed by the OS, but context switch is slower of course. – goodvibration May 26 '19 at 10:32
  • 1
    Fault tolerance is the primary and essential benefit of a process. When code suffers an unrecoverable error, like the one this web-site is named after, then the OS can cleanly terminate that code. Memory isolation is essential to make that work properly. But if a system requires multiple chunks of code to work together then it becomes a strong liability. The memory isolation considerably increases the interaction cost. And much worse, it can be extremely hard to recover when one process dies or stops being responsive. – Hans Passant May 26 '19 at 11:07
  • 1
    I found a blog post that explains the differences extremely well. https://alvinlal.netlify.app/blog/single-thread-vs-child-process-vs-worker-threads-vs-cluster-in-nodejs – lesterfernandez Oct 13 '22 at 22:40

1 Answers1

31

You mentioned point under worker-threads that they are same in nature to child-process. But in reality they are not.

Process has its own memory space on other hand, threads use the shared memory space.

Thread is part of process. Process can start multiple threads. Which means that multiple threads started under process share the memory space allocated for that process.

I guess above point answers your 1st question why thread model is preferred over the process.

2nd point: Lets say processor can handle load of 4 threads at a time. But we have 16 threads. Then all of them will start sharing the CPU time.

Considering 4 core CPU, 4 processes with limited threads can utilize it in better way but when thread count is high, then all threads will start sharing the CPU time. (When I say all threads will start sharing CPU time I'm not considering the priority and niceness of the process, and not even considering the other processes running on the same machine.)

My Quick search about time-slicing and CPU load sharing:

  1. https://en.wikipedia.org/wiki/Time-sharing
  2. https://www.tutorialspoint.com/operating_system/os_process_scheduling_qa2.htm

This article even answers how switching between processes can slow down the overall performance.

Worker threads are are similar in nature to threads in any other programming language.

You can have a look at this thread to understand in overall about difference between thread and process: What is the difference between a process and a thread?

Sudhir Dhumal
  • 902
  • 11
  • 22
  • 2
    So do you mean that in `child process` it uses another core so it can't has shared memory with main process/thread, while `worker threads` operate on the same core so that they can have shared memory? also i know that in a given process only one thread at a time would operate so according to node documentation `worker_threads module enables the use of threads that execute JavaScript in parallel` what they mean by executing code in parallel if only one thread would operate at a single moment? – Abdelfattah May 26 '19 at 11:32
  • 1. Child process can communicate with parent using IPC, which means each process has its own allocated space, parent process just create the new process and OS maintains its tree so its called as parent process otherwise they are two different processes. CPU core and memory sharing are two different things. If you monitor the CPU Core load then you can observe that the load is bounced between the cores but at a time only one thread is utilizing the one core of the CPU (because of the nature of the NodeJS). – Sudhir Dhumal May 26 '19 at 13:49
  • 2. Yes most of the programming languages use the term that threads execute in parallel which is true. But when you have threads loaded with set of instruction which needs good amount of CPU time and hardware is not capable to process all threads at a time then in that case threads share the time. – Sudhir Dhumal May 26 '19 at 13:49
  • 3
    @Abdelfattah can you please mark the answer as accepted if it is what you been asking about. – Sudhir Dhumal Oct 20 '19 at 18:48
  • 3
    @SudhirDhumal you didn't really answer OP's question specifically. Also OP has commented again for clarifying that, you haven't responded to it. This is obvious to me on reading the responses in this post. – divine May 16 '20 at 10:05