
I need to clarify something about Node.js, Promises, CPU and performance.

To set the context: I have an asynchronous operation (a database query) that is executed multiple times in a loop, and I want to do something else only after all of the asynchronous operations are done.

Let's start with a code example:

async function databaseQuery() {
    return await connection.query('SELECT * FROM example;');
}

I want to perform an async call (the databaseQuery function) n times and, when these n executions are over, do something else.

Let's use parallel Promises to achieve this goal:

const array = [...]; // Assuming this array is full of whatever

const promises = array.map(async (item) => {
    return await databaseQuery();
});

await Promise.all(promises);

// Ok now I'm sure all async calls are done

I tried an implementation of this code in two environments:

  • Local machine: Windows 10 x64, Intel i7 (6 cores / 12 threads), 16 GB RAM
  • Remote server (virtualized server @ OVH): Ubuntu 16.04, 1 vCore, 6 GB RAM

Obviously, performance on the local machine is far better than on the remote one (< 1 second vs. > 1 minute).

But I need some clarification on why. I'm aware that the hardware is much better on the local machine.

Moreover, since Node.js runs on a single thread on a single core, why does the Windows Resource Monitor show the Node.js process using 10 threads and 6 processors?


There is no "multi-processing" logic in the code (using cluster, for example).

If I want to put this code in production, should I pay attention to the number of processor cores, or is this just Windows' internal process management?

Please help me clarify this situation. I really want to understand how this works in the background, as it will help me choose the right configuration to run this type of code.

jbrtrnd
  • `promises = array.map(databaseQuery);` presumably? – Roamer-1888 Jul 16 '18 at 13:24
  • maybe helpful for your future development https://www.netguru.co/blog/node.js-is-going-multithread-the-future-of-javasripts-backend-framework – BraveButter Jul 16 '18 at 14:49
  • Your code runs in a single thread (the main loop), but Node creates other threads to which it delegates all of its async I/O requests. It's also possible that `connection.query` does something that creates a child process. What library is it? – Luca Rainone Jul 16 '18 at 15:02
  • Every program is going to show a few other threads in the resource monitor because the operating system gives the program a few threads for communication purposes or something, I don't really know what they are used for tbh. Even a hello world program written in C will have these extra threads. Also like Luca Rainone said, Node also has a few threads it uses for I/O. But your application code is always on the main thread. The only thing Node gives you in terms of "concurrency" is switching context when a function is waiting for I/O to return. – Chris Rollins Jul 16 '18 at 15:06
  • @LucaRainone `connection.query` is some TypeORM code with an extra layer. To come back to my last question: should I pay extra attention to the number of cores in my hardware environment? – jbrtrnd Jul 16 '18 at 19:04
  • @ChrisRollins I understand, so the number of cores shouldn't affect performance? Only the processor itself? – jbrtrnd Jul 16 '18 at 19:08
  • You aren't running a lot of CPU heavy code on the node application, right? So the performance should be related to I/O, so it probably has very little to do with the CPU. – Chris Rollins Jul 17 '18 at 03:55
  • Well no I guess I shouldn't say it has very little to do with the CPU. I suppose that the CPU performance of the database matters a lot in this case. – Chris Rollins Jul 17 '18 at 04:20
  • @jbrtrnd As said: Node.js uses multiple threads to perform I/O tasks. In addition, you, or some library that you use, can use child processes to do heavy operations (like encrypting/decrypting, zipping/unzipping, big data manipulation and so on). So more CPU means that your application could perform better, even though your main loop is single-threaded. It depends on your tasks and how you perform them. – Luca Rainone Jul 17 '18 at 07:30
  • I checked the number of threads of each CPU. It's 1 for the remote machine and 12 for the local one. So, "in theory", if I had one processor with 1 core capable of managing 12 threads, I could approach the same performance (it's just a theoretical statement; I mean more threads, more performance) – jbrtrnd Jul 17 '18 at 07:39
  • Because I run N database queries (I/O tasks), Node.js will internally use multiple threads (I don't think the libraries used are using child processes), so the process will be able to "fake" parallel tasks (because of the large number of threads). So it's not really dependent on Node.js core, but more on libuv and the ecosystem's core/thread usage? – jbrtrnd Jul 17 '18 at 07:49
  • @jbrtrnd most of the time spent on database queries is not in Node but in the DBMS; Node sends the query to the DBMS over a socket and receives the data through the socket. There is no need for, or advantage in, using multiple threads in Node.js at this point, because this can be done in an event-based way. – t.niese Aug 26 '18 at 18:04

1 Answer


Node.js runs the JavaScript you put into it in a single thread, but it has many more threads performing various tasks for I/O and the microtask queue. At the time of writing this answer, Node.js will always have at least 7 threads: 2 for the libuv event loop[1], 4 for running background tasks for V8[2], and 1 for scheduling delayed background tasks for V8.

Your DB library, other addons, or other parts of Node.js core may create additional threads for various reasons.

This shows up as multi-core utilization because (in simplified terms) the operating system schedules those threads across the available cores.
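To illustrate how one JavaScript thread can still have many I/O operations in flight at once, here is a minimal sketch. `fakeQuery` and `runAll` are hypothetical names, and a timer stands in for the database call, so no real driver or database is involved.

```javascript
// Hypothetical stand-in for a database query: resolves after `ms` milliseconds.
// The timer only simulates waiting on I/O; no JavaScript runs while it waits.
function fakeQuery(ms) {
    return new Promise((resolve) => setTimeout(() => resolve(ms), ms));
}

// Starts every simulated query at once, then waits for all of them.
async function runAll(delays) {
    const start = Date.now();
    const results = await Promise.all(delays.map((ms) => fakeQuery(ms)));
    return { results, elapsed: Date.now() - start };
}

runAll([100, 100, 100, 100, 100]).then(({ results, elapsed }) => {
    // The elapsed time is close to the longest single wait, not the sum of all waits.
    console.log(`${results.length} simulated queries finished in about ${elapsed} ms`);
});
```

With a real database the waiting happens in the DBMS and on the network rather than in a timer, so the total time depends mostly on how fast the database machine can answer the queries.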

As a side note, return await is completely unneeded here. You can return promises from async functions and they will be unwrapped, because promise resolution is a flat-map operation.
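A small sketch of that side note, using a hypothetical `fakeQuery` in place of `connection.query` (the row data is made up for illustration):

```javascript
// Hypothetical stand-in for connection.query(): resolves with made-up rows.
function fakeQuery() {
    return Promise.resolve([{ id: 1 }]);
}

// Style used in the question: await the promise, then return the value.
async function withAwait() {
    return await fakeQuery();
}

// Equivalent: return the promise itself and let the async function unwrap it.
async function withoutAwait() {
    return fakeQuery();
}

Promise.all([withAwait(), withoutAwait()]).then(([a, b]) => {
    // Both resolve to the rows themselves, not to a nested promise.
    console.log(Array.isArray(a), Array.isArray(b)); // true true
});
```

One case where `return await` does make a difference is inside a `try` block, where it lets the surrounding `catch` handle a rejection from the awaited promise.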

[1] libuv handles I/O, for example reading from files, opening TCP sockets, and scheduling timers.

[2] V8 background tasks include running the garbage collector and optimizing code.

snek