Issues with running clusters with Nodejs

Question

I read through this article and it kinda hit me -- what could be the drawbacks with running multiple workers with Node? This is all server side, so we know how many cores our server has, therefore we know what it can handle. But surely there must be drawbacks to this? I cannot seem to find anything online regarding this.

One drawback is the # of cores.. the more that are spun up, the larger the chance others can crash. Also, delegating computational power like this, with enough workers, can make your backend ultimately less efficient — displayName, Aug 17 '19 at 16:30
Also, take a look here https://stackoverflow.com/questions/2387724/node-js-on-multi-core-machines/28348308#28348308 — displayName, Aug 17 '19 at 16:32
@displayName - Where do you get the notion that running a second core increases the odds of the first core crashing? — jfriend00, Aug 17 '19 at 16:56

jfriend00 · Accepted Answer · 2019-08-17T16:49:45.977

A meaningful drawback to clustering is that it can increase your code complexity. Any state your server keeps in local memory cannot be directly accessed by the other clustered processes. So, look at something that's relatively simple like the default implementation of express-session. It keeps your session data in the local server's memory. If you do that with clustering, then that creates problems. The first request that establishes the session goes to clusterProcessA and the session object is established in that process's memory. The second request from that same client gets routed to clusterProcessB. Hmm, there's no session for that client in that clusterProcess, so it creates a new session and stores it locally in the memory of clusterProcessB. Obviously this doesn't work, because the client's session state is now split across two processes that can't see each other's copy.
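To make the failure concrete, here is a minimal sketch that simulates it in a single file. It does not use express-session or the cluster module: two plain Map objects stand in for the private memory of clusterProcessA and clusterProcessB, and the names workerStores and handleRequest are made up for illustration. Round-robin dispatch is an assumption about how requests get spread across the processes.

```javascript
// Each Map stands in for the private memory of one clustered process.
// workerStores and handleRequest are hypothetical names, not a real API.
const workerStores = [new Map(), new Map()];
let nextWorker = 0;

// Assumed round-robin dispatch: each request goes to the next process in turn.
function handleRequest(sessionId) {
  const workerIndex = nextWorker;
  nextWorker = (nextWorker + 1) % workerStores.length;

  // Only this process's own store is visible here.
  const store = workerStores[workerIndex];
  let session = store.get(sessionId);
  const isNew = session === undefined;
  if (isNew) {
    session = { visits: 0 };
    store.set(sessionId, session);
  }
  session.visits += 1;
  return { workerIndex, isNew, visits: session.visits };
}

const first = handleRequest('client-1');
const second = handleRequest('client-1');

// The same client lands on a different process and gets a brand new session.
console.log(first);  // { workerIndex: 0, isNew: true, visits: 1 }
console.log(second); // { workerIndex: 1, isNew: true, visits: 1 }
```

The second request reports visits: 1 rather than 2, which is the lost-session symptom described above.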

To solve this, clients either have to be made sticky (so they are always sent to the same clusterProcess) or the session data has to be stored in a shared database and always accessed from there by all clustered processes.
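As a rough sketch of the sticky option, one approach is to hash something stable about the client and use that to pick the process. Everything here is an assumption for illustration: pickWorker and handleStickyRequest are hypothetical helpers, the client key is an example IP address, and the Map objects again simulate per-process memory rather than real clustered processes. In practice stickiness is usually handled by whatever sits in front of the processes.

```javascript
const crypto = require('crypto');

// Hypothetical helper: deterministically map a client key (for example an IP
// address or a cookie value) to one of workerCount processes.
function pickWorker(clientKey, workerCount) {
  const digest = crypto.createHash('sha256').update(clientKey).digest();
  return digest.readUInt32BE(0) % workerCount;
}

// One Map per simulated process, as a stand-in for per-process memory.
const stickyStores = [new Map(), new Map(), new Map(), new Map()];

function handleStickyRequest(clientKey) {
  const workerIndex = pickWorker(clientKey, stickyStores.length);
  const store = stickyStores[workerIndex];
  const session = store.get(clientKey) || { visits: 0 };
  session.visits += 1;
  store.set(clientKey, session);
  return { workerIndex, visits: session.visits };
}

const firstSticky = handleStickyRequest('203.0.113.7');
const secondSticky = handleStickyRequest('203.0.113.7');

// Same client, same process, so the in-memory session is found again.
console.log(firstSticky.workerIndex === secondSticky.workerIndex); // true
console.log(secondSticky.visits); // 2
```

The trade-off is that the load is only as even as the hash of your client keys, and a process that dies takes its clients' sessions with it.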

These are completely solvable problems (sharing session data between clustered processes), but they add meaningful complication to both the coding and deployment. And if you now move all the state to a single shared database that all the clustered servers access, then you may have just created a new bottleneck in your scalability, and you may need to cluster or otherwise scale up your database so it can keep up.

It's worth adding that there are some alternatives to clustering for helping with scalability. If your main bottleneck to scalability is a manageable set of operations that happen to be a bit too CPU-intensive (imagine some image processing), then you can move those specific operations (just the CPU-intensive parts) out of the main Node.js process by farming them out to a set of worker child processes, perhaps fed by a job queue.

This keeps you with one core server process, so you can still have things like session state in memory, but it helps with scalability by moving the operations that were taking the most CPU into other processes where you can more advantageously use all the CPUs. With lots of CPU-intensive work to do, this can even create a more responsive server than clustering, because the main server is kept relatively free to just respond to requests and handle networking, and the more CPU-intensive stuff is all farmed out to child processes where it doesn't interfere with Node's single-threaded responsiveness.
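Here is a minimal sketch of that idea using the built-in child_process module. It is a simplification under stated assumptions: it spawns a fresh child per job instead of keeping a pool of long-lived workers with a job queue, the naive Fibonacci function is just a stand-in for real CPU-heavy work such as image processing, and fibInChild and workerSource are hypothetical names. The worker code is passed via node -e only so the example fits in one file.

```javascript
const { execFile } = require('child_process');

// Stand-in for CPU-heavy work, kept as source text so it runs in a child process.
// The child reads its input from the last command-line argument.
const workerSource = `
  function fib(n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); }
  const n = Number(process.argv[process.argv.length - 1]);
  process.stdout.write(String(fib(n)));
`;

// Hypothetical helper: run the computation in a separate Node process and
// resolve with the result, leaving the main event loop free meanwhile.
function fibInChild(n) {
  return new Promise((resolve, reject) => {
    execFile(process.execPath, ['-e', workerSource, String(n)], (err, stdout) => {
      if (err) {
        reject(err);
        return;
      }
      resolve(Number(stdout));
    });
  });
}

// The main process stays responsive while the child does the work.
fibInChild(10).then((result) => {
  console.log('fib(10) =', result); // fib(10) = 55
});
```

Spawning a process per job has real startup cost, so for steady load a small pool of reusable workers is likely the better fit, which is closer to what the answer describes.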
