6

Currently working on optimizing a library for speed. I've already reduced execution time drastically, using V8 CPU and Memory Profiling through Webstorm. This was achieved mainly by changing the core method from recursive to iterative.

Now the self time distribution breaks down as

[screenshot: V8 CPU profile self-time distribution, with (garbage collector) at 31.6%]

I'm assuming the first entry "node" is timing internal function calls, which is great. The other entries also make sense. I'm new to Node.js profiling, but 31.6% for GC seems high, so I've decided to investigate.

I've now created a heap dump through Webstorm, but unfortunately that doesn't give me much information.

[screenshot: heap snapshot in WebStorm, showing mostly system-internal memory references]

These seem to be system internal memory references mainly. Stepping through the core iteration code logic again, there also don't seem to be a lot of places where memory is explicitly allocated (using this as a reference).

Question

  • Can the GC overhead be reduced?
  • Is this amount of allocation just expected here?
  • Is it possible to get better memory profiling information?

Setup Instructions

In case someone wants to try debugging this, I'm including setup instructions.

Download or clone object-scan and run

yarn install --frozen-lockfile
yarn run test-simple --verbose

Now create a file `test.js` in the project root containing this content, and run `node --trace_gc test.js` or run it through WebStorm for advanced profiling.

vincent

1 Answer

5

In JavaScript, and in V8 (Node.js) in particular, the amount of time spent on garbage collection depends on the amount of data stored in the heap, but that's only one of many factors.

The V8 engine has two main "types" of GC: minor (scavenge) and major (mark-sweep/mark-compact). With `--trace-gc` enabled, you can see in the console which GC types run during your tests. Depending on the case, one type can "eat" more time than the other, and vice versa. So before optimizing, you should determine which GC takes more time.
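A quick way to see which type dominates is to run a small allocation-heavy script under `--trace-gc` and count the `Scavenge` versus `Mark-sweep` lines in the output. A hypothetical sketch (the allocation counts are arbitrary, not taken from the library's test suite):

```javascript
// alloc.js — run as: node --trace-gc alloc.js
// Short-lived garbage mostly shows up as "Scavenge" lines in the trace;
// objects that are kept alive (the `retained` array) survive scavenges,
// get promoted to old space, and eventually cause "Mark-sweep" work.
const retained = [];

for (let i = 0; i < 1e5; i++) {
  // dies immediately after the iteration — new-space garbage
  const shortLived = { i, payload: new Array(32).fill(i) };
  if (i % 1000 === 0) {
    retained.push(shortLived); // every 1000th object is kept alive
  }
}

console.log(`retained ${retained.length} objects`);
```

Comparing the trace of a run like this against your real workload gives a feel for which of the two collectors your code is exercising.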

There are not a lot of options for optimizing major GC, because it is highly affected by the amount of data that stays in memory for a "long" period (here, "long" means the object survives scavenge GC). Such data is stored in the so-called "old space" of the heap. Major GC works with this space: it has to scan all of that memory and mark objects that no longer have any references for later clearance.

In your case, the test data you're loading goes to old space, so it affects major GC throughout the whole test. Major GC won't actually clear much here, because you're still using your test object, but it still consumes time scanning the entire old space. You may consider preventing V8 from doing that by launching Node with GC-specific flags like `--nouse-idle-notification --expose-gc --gc_interval=100500` (where 100500 is the number of allocations between collections; a high value effectively prevents GC from running before the whole test has passed), which lets you trigger garbage collection manually. Test your code with this approach to see how much major GC affects it, and try runs with different amounts of data provided to the function. If the impact is quite high, you may try to refactor your code to minimize long-lived variables, closures, etc.
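With `--expose-gc`, a full collection can be forced right before the measured section so that a major GC pause doesn't land in the middle of the measurement. A minimal sketch (the benchmark body is a placeholder, not the actual object-scan test):

```javascript
// bench.js — run as: node --expose-gc bench.js
// `global.gc()` is only defined when the --expose-gc flag is passed,
// so guard the call to keep the script runnable without the flag.
const canGc = typeof global.gc === 'function';

function runBenchmark() {
  // ... workload under test (placeholder) ...
  let acc = 0;
  for (let i = 0; i < 1e6; i++) acc += i;
  return acc;
}

if (canGc) {
  global.gc(); // full mark-sweep before timing, so it doesn't pollute the run
}

const start = process.hrtime.bigint();
runBenchmark();
const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
console.log(`gc available: ${canGc}, elapsed: ${elapsedMs.toFixed(2)} ms`);
```

Running the same script with and without the forced collection (and with `--gc_interval` raised) shows how much of the measured time is really GC.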

If you discover that major GC doesn't have much impact on performance, then scavenge GC is taking most of the time. Unlike major GC, it operates on the so-called "new space" of the heap, where all new objects are stored. If those objects survive a scavenge, they are moved to old space. New space is much smaller than old space (you can control its size with `--max_semi_space_size`; note: new space size = 2 * semi space size), and the more new objects and variables you allocate, the more scavenge GC runs will happen. If this GC hurts performance too much, consider refactoring your code to make fewer new allocations. But beware: if you reuse variables instead, those objects will go to old space and may become the problem described in the "major GC" section above.

Also, V8's GC doesn't always run in the same thread as your program; it does some work in the background too, and I don't know what WebStorm shows in your case. If it counts only the total time spent in GC, maybe GC doesn't actually have that much impact. You may find more details on V8's GC in this blog post.

TL;DR:

Can the GC overhead be reduced?

  • Yes, but first you should discover what needs to be optimized by following the steps above.

Is this amount of allocation just expected here?

  • That can only be discovered by comparing different approaches. There is no absolute number separating a "good" amount from a "bad" one, because it depends on a lot of factors, including the amount of input data.

Is it possible to get better memory profiling information?

  • You may find some good tools here, but in general Chrome DevTools provide a bit more detail than WebStorm does.
m03geek
  • Thank you very much for this comprehensive answer. I've learned a lot! These seem to be mainly scavenge GC invocations. By increasing `max_semi_space_size` I've managed to improve runtime by another ~27%. I'm already re-using variables as much as I can and fail to see where the allocation would happen. Do you have any tips / strategy for narrowing that down? – vincent May 05 '19 at 14:55
  • I've ended up using `node --expose-gc --inspect=9222 test.js` and then in chrome `chrome://inspect`. That worked reasonably well and I was able to find the relevant locations in the code. Cheers again! – vincent May 05 '19 at 17:14
  • 1
    The simplest example is `let` in for loops. If you declare it inside a loop, it will be scoped and recreated each time. Declaring it outside will make it not scoped, so it will be created only once, but obviously you'll loose all benefits that scoped variables grant. – m03geek May 06 '19 at 20:23
  • Great overview, thanks @m03geek! Is there anywhere official where the `gc_interval` argument is documented (I couldn't find it)? Does it affect both scavenge and m-s-c? I've read elsewhere that `nouse-idle-notification` is now ignored by node; is that so? – logidelic Jun 25 '20 at 23:54