
I am running the same asynchronous function in a loop, without awaiting it, on a system with 2 CPUs. When I logged the process id from inside the function, all of the calls had the same process id.

How does Node handle such parallel executions? Are both CPUs being used effectively? Do I need to manually fork a process for each function call in the loop?


function next() {
  console.log('main started', process.pid);
  const arr = [];
  for (let i = 0; i < 10000000; i++)
    arr.push(1);
  // Comparator must return a number, not a boolean.
  arr.sort(function (a, b) { return a - b; });
  console.log('main ended');
}

function main(){
 next();
 next()
 next();
 next();
 next();
 next();
 console.log('-----------------------------');
}

main()


HTOP Screenshot

  • This might help: https://blog.risingstack.com/node-hero-async-programming-in-node-js/ – crashmstr Sep 06 '19 at 16:29
  • Show us your actual code in the loop and what the asynchronous operation is and we can answer a lot more specifically. node.js runs your Javascript as single threaded, but uses threads for some asynchronous operations (file I/O) and uses native asynchronous support for other types of operations (networking) and uses processes for things like `spawn()` and `exec()`. Show us the code for what you're actually doing so we can explain. – jfriend00 Sep 06 '19 at 16:51
  • Please don't just post and disappear. That's not how stackoverflow works. When people engage with you to help, you will get better help if you are there to interact in a reasonable amount of time. – jfriend00 Sep 06 '19 at 19:10

2 Answers


Node.js runs your actual Javascript in a single thread, so it does not apply more than one CPU to your Javascript unless you specifically design your code to put CPU-intensive tasks in Worker threads, or you create your own separate processes with clustering or with the child_process module and farm work out to them. In a plain node.js program, a CPU-intensive operation (like your long loop of sorting) will hog the one CPU and block the event loop from processing other requests. It will not involve other CPUs in that sorting operation and will not use other CPUs for your Javascript.

When you run an asynchronous operation, there is native code behind that operation, and that native code may or may not use additional threads or processes. For example, file I/O uses a thread pool; networking uses the OS's native asynchronous support (no threads); spawn() or exec() in child_process start new processes.

If you show us the actual code for your specific situation, we can answer more specifically about how that particular operation works.

How does Node handle such parallel executions?

It depends upon what the operation is.

Are both CPUs being used effectively?

Probably not, but we'd need to see your specific code.

Do I need to manually fork process for each function in the loop?

It depends upon the specific situation. To apply multiple CPUs to your own Javascript (as opposed to asynchronous operations), you need multiple processes running your Javascript, or the newer Worker thread API. If the parallelism is all in asynchronous operations, then the event-driven nature of node.js is particularly good at managing many asynchronous operations at once, and you may not even benefit from involving multiple CPUs: most of the time all node.js is doing is waiting for I/O to complete, and many, many requests can already be in flight at the same time very efficiently.

For actually getting multiple CPUs applied to running your Javascript itself, node.js has the clustering module, which is purpose-built for exactly that: you can fire up a cluster process for each actual CPU core in your computer. Or you can use the new Worker thread API.

Also see these answers that discuss how to address CPU intensive code in node.js:

How to process huge array of objects in nodejs

How to apply clustering/spawning child process techniques for Node.js application having both IO bound and CPU bound tasks?

node.js socket.io server long latency

Is it possible somehow do multithreading in NodeJS?

How cpu intensive is too much for node.js (worried about blocking event loop)

jfriend00
  • Thank you for your response. I have updated the post with the code – Zeeshan Shamsuddeen Sep 07 '19 at 05:09
  • @ZeeshanShamsuddeen - That's not real code and doesn't show any real asynchronous operation. As I've said multiple times, strategies for parallelism depend upon the specific asynchronous operation. You will always get better answers here if you show your ACTUAL code, not pseudo code, not theoretical code, not theoretical questions with no code. – jfriend00 Sep 07 '19 at 05:21
  • Apologies because I am new to StackOverFlow. I have edited the post with the actual code. Please have a look into this. – Zeeshan Shamsuddeen Sep 07 '19 at 05:45
  • @ZeeshanShamsuddeen - Neither of those operations are CPU intensive or block the single thread of Javascript execution. The axios call is just a network request which uses no CPU at all while waiting for the response. The database call is also just a networking operation to another process (the database). There's no need to engage any other processes for that loop. Your performance will be limited by the ability of the target servers to respond, not by your node.js process. – jfriend00 Sep 07 '19 at 05:55
  • @jfriend00: Thank you. If the loop count is 100, then I am hitting an API 100 times and waiting for its responses. Now, when all the responses come back in parallel, will Node handle this properly? Will Node use both the CPUs during this time if needed? How do I work out how many such parallel operations can be handled by Node? – Zeeshan Shamsuddeen Sep 07 '19 at 06:57
  • @ZeeshanShamsuddeen - Please reread the first sentence of my answer. We would need to see your code for processing the responses to provide any more detail. For example, are you writing data to a file? Are you saving data to a database? Are you sending data to another server? Those are all asynchronous operations and they make the answer different than if you are just munching on the result with your own Javascript and not doing anything else. Since you refuse to provide us your real code, I won't be commenting further. – jfriend00 Sep 07 '19 at 15:19
  • I have already edited the post which includes the full code. The function gets a data from a server and then saves that data to a mongo database. – Zeeshan Shamsuddeen Sep 07 '19 at 16:23
  • @ZeeshanShamsuddeen - That's obviously not your real code because that wouldn't even parse properly as there's no function name. But, that concept will work just fine for a loop of requests because there's basically no CPU involved in either of those asynchronous operations. Those are both networking calls where node.js uses the asynchronous OS APIs for networking which are very efficient for having lots of requests in flight at the same time. FYI, there's no point to the second `await`. It doesn't accomplish anything. – jfriend00 Sep 07 '19 at 18:02
  • @ZeeshanShamsuddeen - If you were doing a loop of 100, then you'd have to consider how either the target web site for the axios call or the database respond when you throw 100 simultaneous calls at them. Many of the larger web sites will detect abuse of their site and either fail some of your connections or slow them way down. My guess is the database will queue requests (that's up to the internal implementation of the database) since it can't really process 100 simultaneous requests efficiently. – jfriend00 Sep 07 '19 at 18:08
  • Thank you @jfriend00. Now I understand that network calls and DB calls will not have much performance impact with Node, as Node will be sitting idle until their respective responses come back. I hope I am almost right? Just to be clear, I have edited the code again and added a CPU intensive process like sort (which Node should do itself, if I am not wrong). What will the situation be here? Will node use other CPUs as well? What will the CPUs' situation be during this process? – Zeeshan Shamsuddeen Sep 08 '19 at 09:29
  • @ZeeshanShamsuddeen - I added more to my answer about CPU intensive tasks. Please reread the answer. – jfriend00 Sep 08 '19 at 15:30
  • Thank you very much @jfriend00. I have added some code to my post which I ran on my dual core server and observed HTOP. I saw that my node file uses both CPUs. So I assume that node will use the available cores for processing CPU intensive tasks. – Zeeshan Shamsuddeen Sep 10 '19 at 04:48
  • @ZeeshanShamsuddeen - Some asynchronous operations (such as file I/O) rely on a thread pool that may involve other CPUs. But, your own Javascript is entirely single threaded and runs on only one CPU. – jfriend00 Sep 10 '19 at 04:52
  • @ZeeshanShamsuddeen - Has this answered your question? If so, you can indicate that to the rest of the community by clicking the checkmark to the left of the answer and that will also earn you some reputation points here for following the proper procedure. – jfriend00 Sep 10 '19 at 04:54
  • @jfriend00, my above code does not involve any I/O tasks; still, how does it use both the CPUs? Is this expected? – Zeeshan Shamsuddeen Sep 10 '19 at 06:28
  • @ZeeshanShamsuddeen - I don't know what else node.js might be using another thread for that the OS would use a second CPU for (console logging, memory housekeeping, garbage collection, etc...). But, as I've said probably 5 times already, it runs YOUR Javascript as a single thread on one CPU unless you do the things I've mentioned already. – jfriend00 Sep 10 '19 at 06:42

You can use Promise.all, which starts all the asynchronous calls and resolves once every one of them has resolved, returning their results. This way node doesn't wait for each array item to finish before starting the next one.

// Must be inside an async function to use await:
const results = await Promise.all(
  array.map(arrayItem => executeAsynchronousFunctionWith(arrayItem))
);

The variable results is an array of the resolved results, in the same order as the input array.
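A self-contained sketch of the same pattern, with a stand-in setTimeout-based function (fakeFetch) in place of the real network or database call from the question:

```javascript
// Stand-in async work; in the question this would be the axios call
// plus the database save.
const fakeFetch = n =>
  new Promise(resolve => setTimeout(() => resolve(n * 2), 10));

async function main() {
  const array = [1, 2, 3];
  // All three "requests" start immediately and are awaited together.
  const results = await Promise.all(array.map(n => fakeFetch(n)));
  console.log(results); // prints [ 2, 4, 6 ]
}

main();
```

The total wait is roughly the slowest single call, not the sum of all of them, because the calls overlap while node sits idle waiting for I/O.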