
My question is about performance in my NodeJS app...

If my program runs 12 iterations of 1,250,000 each = 15,000,000 iterations altogether, it takes dedicated servers at Amazon the following time to process:

r3.large: 2 vCPU, 6.5 ECU, 15 GB memory --> 123 minutes

c4.8xlarge: 36 vCPU, 132 ECU, 60 GB memory --> 102 minutes

I have some code similar to the code below...

start();

function start() {
  for (var i = 0; i < 12; i++) {
    // Iterates over a collection which contains data split up into date
    // intervals. This function is actually also recursive, because it runs
    // through the data many times (max 50-100 times) due to the different
    // interval sizes...
    function2();
  }
}

function function2() {
  return new Promise(function (resolve) {
    for (var i = 0; i < 1250000; i++) {
      // Iterates through all possible combinations and calls function3
      // with each given value/combination.
      function3();
    }
    resolve();
  });
}

function function3() {
  return new Promise(function (resolve) {
    // Makes some calculations based on the given values/combination and
    // then returns the result to function2, which in the end decides
    // which result/combination was the best...
    resolve();
  });
}

This is equal to 0.411 milliseconds / 411 microseconds per iteration!

When I look at performance and memory usage in the Task Manager, the CPU is not running at 100% but more like 50% the entire time. The memory usage starts very low, but keeps growing by gigabytes every minute until the process is done - yet the (allocated) memory is only released when I press CTRL+C in the Windows CMD. So it's as if the NodeJS garbage collection doesn't work optimally - or maybe it's simply the design of the code again...
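To put numbers on the memory growth (instead of just watching the Task Manager), a minimal sketch using Node's built-in process.memoryUsage() could be dropped in - the 10-second interval is arbitrary and this snippet is not part of the program above:

// Logs heap and RSS usage every 10 seconds (interval chosen arbitrarily).
setInterval(function () {
  var mem = process.memoryUsage();   // built-in Node API, sizes in bytes
  console.log('heapUsed: ' + (mem.heapUsed / 1048576).toFixed(1) + ' MB, ' +
              'rss: ' + (mem.rss / 1048576).toFixed(1) + ' MB');
}, 10000);

(Note that if the main computation never yields to the event loop, this timer will never fire - which is itself a symptom of the problem.)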

When I execute the app I use the memory option like this:

node --max-old-space-size="50000" server.js

PLEASE tell me everything you think I can do to make my program FASTER!

Thank you all - so much!

PabloDK
  • If you have the need to create 15M promises in a tight loop it's beginning to sound like you should restructure your application considerably, instead of focussing on how to speed up that loop. Can you provide more information on what your app is doing? Why do you need that many promises? What does `function3` do? – robertklep Jun 23 '16 at 08:01
  • First of all - I'm new to Node... so design errors might be very possible! In short, the program does NOT use/access a DB, write to disk or do anything demanding inside the many iterations... it simply works with some simple arrays/objects... it simply makes a lot of calculations/analysis on a lot of data, which is fetched from the DB in a step before all this... The reason why I use promises is because Node is async by design, and I thought it was a good way to do it? I'm used to working with sync code like .NET/C#... – PabloDK Jun 23 '16 at 08:14
  • Wrapping calculations in promises doesn't necessarily make it perform a lot better, especially when you create so many promises. If you google for _"node heavy computation"_ you may find some good pointers on how to split up calculations over different child processes, utilizing more CPU resources than a single Node process can (see the child-process sketch after these comments). There are also [various modules](https://npms.io/search?term=computation+parallel) that may help you. – robertklep Jun 23 '16 at 08:25
  • Thank you and I understand... And I know there are 1000 things I could possibly do... BUT I really need somebody with more experience than me to tell me what exact solution would be the best for me... I can easily waste another week on the wrong "solution", frameworks etc... I need a concrete code example... – PabloDK Jun 23 '16 at 08:42
  • I understand. If you can perhaps explain the type of calculations you need to perform (I assume that the "12" and "1250000" are referring to something specific), perhaps it'll make it easier for people to help you out. – robertklep Jun 23 '16 at 08:47
  • 12 and 1,250,000 are only examples - because I made a concrete test with these numbers on the Amazon servers... I fetch millions of data/simple objects from the DB, the code runs through all the data, splitting it into different intervals and making some analysis and calculations... that's all... I think a smarter code structure would help a lot... – PabloDK Jun 23 '16 at 08:56
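As robertklep suggests above, one way to use all 36 cores is to split the 12 outer iterations across child processes. A minimal sketch with Node's built-in child_process.fork - the file name worker.js, the message format and the round-robin split are illustrative assumptions, not the asker's actual code:

// main.js - fork one child per CPU core and give each a share of the
// 12 outer iterations.
var fork = require('child_process').fork;
var cpuCount = require('os').cpus().length;

var tasks = [];
for (var i = 0; i < 12; i++) tasks.push(i);   // one entry per outer iteration

var numWorkers = Math.min(cpuCount, tasks.length);
var done = 0;

for (var w = 0; w < numWorkers; w++) {
  var child = fork(__dirname + '/worker.js');
  // filter() runs synchronously, so `w` still has the right value here
  child.send(tasks.filter(function (t) { return t % numWorkers === w; }));
  child.on('message', function (result) {
    console.log('worker finished, best result:', result);
    if (++done === numWorkers) process.exit(0);
  });
}

// worker.js - receives a list of interval indexes, runs the heavy
// calculations for each, and reports the best result back to the parent.
process.on('message', function (taskList) {
  var best = null;
  taskList.forEach(function (intervalIndex) {
    // ... run the per-interval calculations here and keep the best result ...
  });
  process.send(best);
  process.disconnect();   // let the child exit once the message is flushed
});

Each child is a separate OS process with its own event loop and garbage collector, so the heavy loops no longer starve a single thread.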

1 Answer


It's not that the garbage collector doesn't work optimally but that it doesn't work at all - you don't give it any chance to.

When developing the tco module that does tail call optimization in Node, I noticed a strange thing. It seemed to leak memory and I didn't know why. It turned out that it was because of a few console.log() calls in various places that I used for testing, to see what was going on, because seeing the result of a recursive call millions of levels deep took some time, so I wanted to see something while it was doing it.

Your example is pretty similar to that.

Remember that Node is single-threaded. When your computations run, nothing else can - including the GC. Your code is completely synchronous and blocking - even though it's generating millions of promises in a blocking manner. It is blocking because it never reaches the event loop.

Consider this example:

var a = 0, b = 10000000;

function numbers() {
  while (a < b) {
    console.log("Number " + a++);
  }
}

numbers();

It's pretty simple - you want to print 10 million numbers. But when you run it, it behaves very strangely - for example, it prints numbers up to some point and then stops for several seconds, then keeps going, or maybe starts thrashing if you're using swap, or maybe gives you this error, which I just got right after it printed Number 8486:

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
Aborted

What's going on here is that the main thread is blocked in a synchronous loop where it keeps creating objects but the GC has no chance to release them.

For such long-running tasks you need to divide your work and get back to the event loop once in a while.

Here is how you can fix this problem:

var a = 0, b = 10000000;

function numbers() {
  var i = 0;
  while (a < b && i++ < 100) {
    console.log("Number " + a++);
  }
  if (a < b) setImmediate(numbers);
}

numbers();

It does the same - it prints numbers from a to b, but in batches of 100, and then it schedules itself to continue at the end of the event loop.

Output of $(which time) -v node numbers1.js 2>&1 | egrep 'Maximum resident|FATAL'

FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - process out of memory
    Maximum resident set size (kbytes): 1495968

It used 1.5GB of memory and crashed.

Output of $(which time) -v node numbers2.js 2>&1 | egrep 'Maximum resident|FATAL'

    Maximum resident set size (kbytes): 56404

It used 56MB of memory and finished.
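The same batching idea can be wrapped in a promise, so a flow like the one in the question stays intact. A minimal sketch, assuming function3 does its per-combination work synchronously; the batch size of 1000 is an arbitrary starting point for tuning:

function function2() {
  return new Promise(function (resolve) {
    var i = 0, total = 1250000, batchSize = 1000;
    function batch() {
      // Do a bounded chunk of synchronous work...
      for (var n = 0; n < batchSize && i < total; n++, i++) {
        function3();              // the heavy per-combination calculation
      }
      if (i < total) {
        setImmediate(batch);      // ...then yield to the event loop and continue
      } else {
        resolve();                // all 1,250,000 iterations done
      }
    }
    batch();
  });
}

Between batches the event loop gets control, so timers fire, the garbage collector can run, and memory stays bounded.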

rsp
  • I do not print millions of console.log calls... only 1... the final result... or yes, sometimes... every 10,000 iterations... which ends up being something like MAX 1,000 console.log calls... Please remember - my program runs smoothly on the bigger servers... not on my laptop... there it runs into memory problems... because of all the 'pending' data/promises... But how do I solve this in the best possible way? – PabloDK Jun 23 '16 at 09:27
  • @PabloDK console.log was just an example of an operation that creates objects that are not collected. In any case you will get into problems in Node when you block the event loop from running, and the solution is usually to divide your tasks into smaller ones. See my updated answer. If you explained more about what you are doing (computing numbers or something) and what you want to do with the results once they're created (run some callback? print some stats to the console?) I may be able to say more. – rsp Jun 23 '16 at 09:37
  • I get your point... I have just added some comments to my code... Could you please take my code as the example... how would it be possible to "split it up" without "destroying" the entire flow of the promises/resolve statements? Also, isn't there any other way to optimize the execution in a better/faster way... like "multi-threading"... which would help a lot on an Amazon server with 36 cores... ;-) – PabloDK Jun 23 '16 at 10:09
  • BTW... my loop structure is like this...(only look at the code) https://stackoverflow.com/questions/36155096/multiple-embedded-loops-in-nodejs/36155954?noredirect=1#comment59954493_36155954 – PabloDK Jun 23 '16 at 10:14
  • I worked on some new code... but before I show you... it suddenly throws this error out of thin air!?? immediate._onImmediate is not a function When I google it, it's related to an error in NodeJS...?? I have no code with names like that... and I did exactly what you told me... the code also executes for 1-2 seconds... but then it fails with that msg... – PabloDK Jun 23 '16 at 15:34
  • You just solved a problem I've been working on for days bringing a 20+ min script to a more suitable ~1:30 min (it's still a lot of work). Thank you! – DanielM Apr 09 '18 at 19:19
  • This one sounds very reasonable - I had around 700 files to iterate over, each file weighing around 150KB to 1.5MB. I iterated over the files and created a lot of promises, which worked for something like 30 files. What I did was take the files, split them into chunks of 20 and iterate with a `for (let file of files)` - there are two loops: one for the chunks and one for the files in each chunk. It seems that the performance issues have been resolved and the memory usage was balanced. – Roy Segall Jun 15 '21 at 06:21