
I have a problem with Node.js memory usage. I'm not sure whether it is related to the garbage collector.

Below is the monitoring for the past 6 hours. Memory usage grows and drops in an almost identical pattern every hour. I've also checked the past day and the past 3 days, and the pattern is almost the same.

[Images: memory usage monitoring graphs]

Observations of the cycle:

  1. The lowest point (the memory release) always occurs at minute 42 of the hour.
  2. Usage then keeps growing, with a second memory release at minute 05 of the hour.
  3. It then continues to increase to a peak at minute 40 of the hour, and the cycle returns to item 1.

The application:

  1. The application serves a few CRUD APIs and also handles some reverse-proxy requests.
  2. The application has a few "cron" jobs, implemented with setInterval and run every 60 seconds, that call 3rd-party APIs to pull some data.
  3. I've checked all my .js files. There doesn't seem to be any global variable that stores ever-growing data, apart from some config data that is only set when the application starts.
  4. In other words, the cron jobs don't update global variables. Instead, they temporarily hold data in variables within the setInterval callback's scope and push it to the DB when the task ends. (I don't explicitly clear those variables after pushing the data to the DB, as I expect the GC to reclaim them automatically.)

I've read a few posts, and it seems a GC run usually takes only a few ms. It shouldn't take an hour to clear unused objects, should it?
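One way to narrow this down is to log the process's own memory counters, so that growth in the JavaScript heap can be told apart from growth in the overall process footprint. The following is a minimal sketch, assuming the graphs above show whole-process or whole-server memory; the helper names `toMB` and `memorySnapshot` are made up for illustration.

```javascript
// Convert bytes to megabytes, rounded to one decimal place.
function toMB(bytes) {
  return Math.round((bytes / 1024 / 1024) * 10) / 10;
}

// Take one snapshot of the memory counters Node.js reports for this process.
function memorySnapshot() {
  const { rss, heapTotal, heapUsed, external } = process.memoryUsage();
  return {
    rss: toMB(rss),             // total memory held by the process
    heapTotal: toMB(heapTotal), // heap reserved by V8
    heapUsed: toMB(heapUsed),   // heap currently occupied by JS objects
    external: toMB(external),   // memory of C++ objects bound to JS objects
  };
}

console.log(new Date().toISOString(), memorySnapshot());

// To sample periodically (hypothetical 60 s period):
// const timer = setInterval(
//   () => console.log(new Date().toISOString(), memorySnapshot()),
//   60_000
// );
// timer.unref(); // do not keep the process alive just for logging
```

If `heapUsed` follows the same hourly sawtooth as the graph, the pattern comes from JavaScript objects being allocated and collected; if only `rss` does, the cause may lie outside the JS heap.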

I need some advice here. Thanks.

EDIT 1: I found the Node.js memory leak post, which shows similar symptoms. I implemented the following setInterval pattern and am currently monitoring the effect:

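function b() {
    // Create a fresh interval on each cycle instead of reusing one long-lived timer.
    var a = setInterval(function() {
        console.log("Hello");
        clearInterval(a);  // cancel the current interval after its first tick
        b();               // start the next cycle with a new interval
    }, 50);
}
b();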

Edit 2: The above solution doesn't work. The result is still the same after 6 hours of monitoring.

Edit 3: I increased the interval from 60s to 120s, and memory usage now looks more stable. It seems one cause of the spikes was the high frequency of the setInterval tasks, which need more time to complete.

However, if that were the case, server memory should blow up after some time as the memory needed to run the tasks stacks up. It still doesn't explain the sudden drop (release) of memory. I need some advice here.

[Image: memory usage graph]
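If the timer tasks are asynchronous and a run can take longer than the interval, setInterval will start the next run while the previous one is still in flight, so their working data is alive at the same time. Whether that actually happens in this application is an assumption. A minimal sketch of a guard that skips a tick while the previous run is still busy (`nonOverlapping` and `pullThirdPartyData` are hypothetical names):

```javascript
// Wrap an async task so a new run is skipped while the previous one is in flight.
function nonOverlapping(task) {
  let running = false;
  return async function run() {
    if (running) return false; // previous run still busy: skip this tick
    running = true;
    try {
      await task();
      return true; // this tick actually ran the task
    } finally {
      running = false; // always release the guard, even if the task throws
    }
  };
}

// Hypothetical usage with a 60 s interval:
// const pull = nonOverlapping(pullThirdPartyData);
// setInterval(pull, 60_000);
```

With the guard in place, at most one run's data is held at a time, which bounds the peak memory the timers can contribute.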

Edit 4: After 24 hours of observation, the application appears to run stably after increasing the interval.

[Image: memory usage graph]

Conclusion: Thanks to jfriend00 for pointing it out: one principle of garbage collection is that it is often triggered during periods of low activity, when the JavaScript engine is idle. This minimizes the impact on the application's performance.

Increase the setTimeout or setInterval interval to give the JS engine time to "breathe".

Jerry
  • So, this isn't a memory leak because the memory usage does recover (doesn't grow forever). So, it is apparently either a very lazy garbage collector or some code you have that retains references to data longer than you think (perhaps a closure or something like a cache in the API library). Since you don't show any code, all people can do here is offer wild guesses. If you showed the relevant code including which modules you're using, then people might have more concrete ideas. – jfriend00 Aug 21 '23 at 07:42
  • Also, is this really a problem? Or you're just curious why it behaves this way? – jfriend00 Aug 21 '23 at 07:42
  • FYI, databases routinely use available memory to improve performance (holding indexes in memory, caching data, pooling connections, etc...). So, if there's a database involved in that memory usage track, then it could certainly be part of the cause. – jfriend00 Aug 21 '23 at 07:45
  • @jfriend00 it did cause the server to run out of resources. I upgraded the spec, so it is currently still manageable. It is better to find the root cause and mitigate it. – Jerry Aug 22 '23 at 02:15
  • This server doesn't host a DB, just the Node.js app using the mongo library to connect to the DB. – Jerry Aug 22 '23 at 02:16
  • @jfriend00 I added some observations in edits 1, 2 & 3. Feel free to check them out. – Jerry Aug 22 '23 at 04:05
  • If you never let your server go idle because of repeated timers that are busy the whole time, then the garbage collector never gets to do its full job as it prioritizes the running of your code until it senses an emergency in memory management. – jfriend00 Aug 22 '23 at 05:37

1 Answer


Thanks to @jfriend00 for pointing it out: one principle of garbage collection is that it is often triggered during periods of low activity, when the JavaScript engine is idle. This minimizes the impact on the application's performance.

In my case, the cause was a few timers running at high frequency: 2 medium-to-heavy tasks running every 60s and 1 light task running every 20s.

Although none of the timer tasks use global variables, and they use recursive functions to minimize memory usage, they still consume a lot of memory because they process large sets of data.

Given that GC principle, the garbage collector doesn't get a chance to clear the garbage because the JS engine is busy repeating the given tasks.
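To check when collections actually happen relative to the timer runs, Node.js can report GC events through the `perf_hooks` module. The following is a minimal sketch rather than the method used in this case; `formatGcEntry` and `watchGc` are names made up for illustration.

```javascript
const { PerformanceObserver } = require('perf_hooks');

// Format one GC performance entry as a log line.
function formatGcEntry(entry) {
  const at = Math.round(entry.startTime); // ms since the process time origin
  return `gc at +${at} ms, pause ${entry.duration.toFixed(2)} ms`;
}

// Start observing GC events. Returns the observer so the caller can
// stop it later with disconnect().
function watchGc(log = console.log) {
  const observer = new PerformanceObserver((list) => {
    for (const entry of list.getEntries()) {
      log(formatGcEntry(entry));
    }
  });
  observer.observe({ entryTypes: ['gc'] });
  return observer;
}

// Usage (commented out so loading this sketch has no side effects):
// const gcObserver = watchGc();
// ... later: gcObserver.disconnect();
```

Comparing these log lines with the timestamps of the timer runs shows whether collections cluster in the gaps between runs or also occur while the tasks are executing.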

Hence, if the task isn't very time-sensitive, increase the setTimeout or setInterval interval to give the JS engine time to "breathe". This is part of the consideration when designing your software.

Side note: JavaScript is primarily single-threaded. This means that JavaScript code is executed in a single sequence, or thread, of execution.

Jerry
  • One way to assure there is always a pause between the completion of your timer processing and the next run of the processing is to keep track of how long it took to run your processing. Then do a `setTimeout()` to schedule the next run. That `setTimeout()` can adjust its time based on how long the first run took, but never be so small that the GC doesn't get a chance to do its job. So, you don't use just one `setInterval()` or one constant `setTimeout()`. Instead, you smartly adjust based on how long the processing took - always guaranteeing enough time for the GC to run. – jfriend00 Aug 23 '23 at 03:13
  • So it doesn't run on a schedule, but based on task completion instead? – Jerry Aug 23 '23 at 03:16
  • 1
    If you want to run it as often as possible, then yes based on task completion. But, you can also do a hybrid where you schedule the next run to be some minimum time after task completion (to let GC run) unless the task finished quickly, in which case you take your desired duration between runs and subtract off the execution time of the just completed run to then do a `setTimeout()` for that delta in time (thus getting your desired interval, but assuring always enough time for GC to run). – jfriend00 Aug 23 '23 at 03:22
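
The hybrid scheduling described in the comments above can be sketched as follows. This is one possible reading of that suggestion, not code from the original posters: `nextDelay`, `runRepeatedly`, `intervalMs`, and `minPauseMs` are hypothetical names, and the right minimum pause depends on the workload.

```javascript
// Delay before the next run: aim for the desired interval measured from the
// start of the last run, but never wait less than the minimum pause.
function nextDelay(intervalMs, elapsedMs, minPauseMs) {
  return Math.max(minPauseMs, intervalMs - elapsedMs);
}

// Run `task` repeatedly with setTimeout instead of setInterval, scheduling
// each run only after the previous one has completed.
// Returns a function that stops the loop.
function runRepeatedly(task, { intervalMs, minPauseMs }) {
  let timer = null;
  let stopped = false;

  async function tick() {
    const startedAt = Date.now();
    try {
      await task();
    } catch (err) {
      console.error('task failed:', err); // keep the loop alive on errors
    }
    if (stopped) return;
    const elapsedMs = Date.now() - startedAt;
    timer = setTimeout(tick, nextDelay(intervalMs, elapsedMs, minPauseMs));
  }

  timer = setTimeout(tick, 0); // first run on the next turn of the event loop
  return function stop() {
    stopped = true;
    clearTimeout(timer);
  };
}

// Hypothetical usage: target one run per 60 s with at least 10 s idle between runs.
// const stop = runRepeatedly(pullThirdPartyData, { intervalMs: 60_000, minPauseMs: 10_000 });
```

Because the next run is scheduled only after the previous one finishes, runs can never overlap, and a slow run pushes the schedule back instead of eating into the idle gap.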