
I am working on a Node.js application and am facing severe memory pressure issues. Below is the trend of memory usage within 60 minutes of starting the application (under constant load):

[Image: New Relic graph of physical memory usage over the first 60 minutes]

The memory usage rises sharply to 95%, but afterwards stays fairly constant.

I took memory dumps before and after, but it seems there is some issue with how I captured them, because the difference between the two dump files is just a couple of MBs.

Here is how the heap dump looks in Chrome's profiler:

[Image: heap snapshot loaded in the Chrome DevTools profiler]

But I am quite new to this and am not sure what I should be looking for.

Can someone please provide pointers on how to figure out the memory issue, how to detect the memory leaks, or how to make sense of the heapdump file mentioned above?

Node version: 0.12.14

Output of `find node_modules -type f -name "*.node"`:

node_modules/simpleflake/node_modules/bignum/build/Release/bignum.node
node_modules/simpleflake/node_modules/bignum/build/Release/obj.target/bignum.node
node_modules/heapdump/build/Release/addon.node
node_modules/heapdump/build/Release/obj.target/addon.node
node_modules/couchbase/build/Release/couchbase_impl.node

Output of `npm list --depth=0`:

aws-sdk@2.5.3
body-parser@1.11.0
check-types@6.0.0
couchbase@2.2.2
elasticsearch@10.1.3
expect@1.20.2
express@4.14.0
jshint@2.9.3
minimist@1.1.3
mocha@2.5.3
moment@2.9.0
morgan@1.5.3
newrelic@1.30.0
request@2.53.0
simpleflake@1.0.0
underscore@1.7.0
why-is-node-running@1.2.2 (https://github.com/mindtickle/why-is-node-running.git#96f3c8da54b110e8a9a1423361d2da7c125784f6)
winston@1.0.2
winston-aws-cloudwatch@0.4.2

Also, is it possible to see whether any of the packages I am using is responsible for the memory leaks?

Thanks in advance.

Ashish Goel
  • What version of node are you using? `node --version` What native modules are you running? `find node_modules -type f -name "*.node"` – Matt Aug 30 '16 at 09:05
  • and how are you producing the physical memory graph? – Matt Aug 30 '16 at 09:06
  • Hi Matt, thanks for taking an interest. I have edited the question with some more info about the Node version and the packages used. The graph was taken from New Relic (the newrelic agent is running on the server, providing real-time server data). – Ashish Goel Aug 30 '16 at 10:32
  • Ok. I don't know how newrelic collects that.. Try `free` or `cat /proc/meminfo` on the system when in the full memory state. – Matt Aug 30 '16 at 11:28
  • @Matt - there is no doubt that these memory usage numbers are correct; the system is running out of memory and the application is crashing. Furthermore, the newrelic agent has been running on this machine for a long time. – Ashish Goel Aug 30 '16 at 11:50
  • Sure, a `cat /proc/meminfo` would still be useful to see what memory is being used up. I'm just writing up a node version of [this](http://stackoverflow.com/a/25512908/1318694). The basic concept is the same though. – Matt Aug 30 '16 at 13:07

1 Answer


The two main areas to look at are object retention and native leaks. This holds true for most garbage-collected languages running on a VM.

Most likely, there is something in your app or its modules that is retaining references to objects and filling up the object space.

Next are modules that use native code and leak native memory, which won't show up in the GC object space.

Finally, Node.js itself may have a native leak. This is unlikely given the large number of users, but it's always a possibility, especially with older versions of Node.js.
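To make the first case concrete, here is a hypothetical illustration (not code from the question) of the classic retention pattern: a module-level structure that only ever grows, so the GC can never reclaim what it references. Express is used only because it appears in the question's dependency list.

// Hypothetical example of object retention, not code from the question.
// A module-level cache grows on every request and nothing ever evicts
// entries, so the garbage collector can never reclaim the responses it
// still references.
var express = require('express');
var app = express();

var cache = {}; // lives for the lifetime of the process

app.get('/users/:id', function (req, res) {
  // A unique key per request means the cache only ever grows.
  var key = req.params.id + ':' + Date.now();
  cache[key] = {
    id: req.params.id,
    fetchedAt: new Date(),
    payload: new Array(1000).join('x') // ~1 KB retained per request
  };
  res.json(cache[key]);
});

app.listen(3000);

Fixing this kind of leak usually means bounding the structure (an LRU or TTL cache) or removing references and event listeners once the work is done.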

Application Memory and Garbage Collection

Run your application with Node's garbage collection logging turned on:

node --trace_gc --trace_gc_verbose app.js

This will print a block of information for each GC event. The main info is the first line, which tells you how much memory Node.js was using before -> after the GC.

[97577:0x101804a00]    38841 ms: Scavenge 369.0 (412.2) -> 356.0 (414.2) MB, 5.4 / 0 ms [allocation failure].

The --trace_gc_verbose gives you all the lines after this with more detail on each memory space.

[97577:0x101804a00] Memory allocator,   used: 424180 KB, available: 1074956 KB
[97577:0x101804a00] New space,          used:    789 KB, available:  15334 KB, committed:  32248 KB
[97577:0x101804a00] Old space,          used: 311482 KB, available:      0 KB, committed: 321328 KB
[97577:0x101804a00] Code space,         used:  22697 KB, available:   3117 KB, committed:  26170 KB
[97577:0x101804a00] Map space,          used:  15031 KB, available:   3273 KB, committed:  19209 KB
[97577:0x101804a00] Large object space, used:  14497 KB, available: 1073915 KB, committed:  14640 KB
[97577:0x101804a00] All spaces,         used: 364498 KB, available: 1095640 KB, committed: 413596 KB
[97577:0x101804a00] External memory reported:  19448 KB
[97577:0x101804a00] Total time spent in GC  : 944.0 ms

Graphing these values normally produces a sawtooth pattern. This "sawtoothing" happens on multiple levels as different memory spaces fill up, hit a limit, and then are garbage collected back down. The graph in Dynatrace's About Performance blog post on Understanding Garbage Collection and hunting Memory Leaks in Node.js shows an app that is slowly growing its object space while still displaying that sawtooth.

Over time you should see which memory spaces grow and which don't, and that can give you some context for what code is keeping object references. If you don't see any growth in the heap even though the process memory keeps rising, you may have a real memory leak, either in native modules or in Node.js itself.
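If you want to graph these values yourself, a small helper can turn the --trace_gc output into CSV. This is only a sketch keyed to the log format shown above (the file name gc-to-csv.js is made up, and different Node.js versions format these lines slightly differently, so adjust the regex as needed):

// gc-to-csv.js (hypothetical helper): convert --trace_gc output to CSV.
// Usage: node --trace_gc app.js 2>&1 | node gc-to-csv.js > gc.csv
var readline = require('readline');

var rl = readline.createInterface({ input: process.stdin, terminal: false });

console.log('ms,gc_type,used_before_mb,used_after_mb,committed_after_mb');

rl.on('line', function (line) {
  // Matches lines like:
  // [97577:0x101804a00]  38841 ms: Scavenge 369.0 (412.2) -> 356.0 (414.2) MB, ...
  var m = line.match(/(\d+) ms: ([\w-]+) ([\d.]+) \(([\d.]+)\) -> ([\d.]+) \(([\d.]+)\) MB/);
  if (m) {
    console.log([m[1], m[2], m[3], m[5], m[6]].join(','));
  }
});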

Process Memory

The memory a Node application reports as used by the garbage collector, called the "heap", doesn't always match what the OS reports for the process. The OS can have a larger amount of memory allocated than Node.js is currently using. This can be normal: if the OS isn't under memory pressure and Node.js garbage collects objects, there will be a difference, because the OS retains the extra memory space for the process. It can also be abnormal, when something is leaking and the allocated process memory continually grows.
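You can also watch this gap from inside the process: process.memoryUsage() reports the OS resident set size (rss) alongside the V8 heap figures. A minimal sketch (the 15-second interval is an arbitrary choice):

// Log OS-level vs V8 heap memory every 15 seconds.
// rss       = resident set size the OS has given the process
// heapTotal = memory V8 has reserved for the JS heap
// heapUsed  = memory actually occupied by JS objects
setInterval(function () {
  var m = process.memoryUsage();
  console.log(new Date().toISOString(),
    'rss=' + Math.round(m.rss / 1048576) + 'MB',
    'heapTotal=' + Math.round(m.heapTotal / 1048576) + 'MB',
    'heapUsed=' + Math.round(m.heapUsed / 1048576) + 'MB');
}, 15000);

If rss keeps climbing while heapUsed flattens out after each GC, the growth is happening outside the JS heap; if both climb together, something is retaining objects.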

Gather some OS memory information along with your app

node --trace_gc app.js &
pid=$!

while sleep 15; do
  pss=$(sudo awk '/^Pss/{p=p+$2}END{print p}' /proc/$pid/smaps)
  echo "$(date) pss[$pss]"
done

This will give you a date and a memory value (in kB, as reported by smaps) in line with your GC output to compare:

[531:0x101804a00]    12539 ms: Scavenge 261.6 (306.4) -> 246.6 (307.4) MB, 5.0 / 0 ms [allocation failure].
Tue Aug 30 12:34:46 UTC 2016 pss[3396192]

The Pss figure in smaps takes shared memory duplication into account, so it is more accurate than the Rss figure reported by ps.

Heapdumps

If you do find that the garbage collection log memory matches the OS process memory growth, then it's likely the application, or a module it depends on, is retaining references to old objects, and you will need to start looking at heap dumps to identify what those objects are.

Use Chrome DevTools and run your app in debug mode so you can attach to it and create heap dumps.

node --inspect app.js

You will see a little Node icon in the top left of DevTools when a Node process is detected; click that to open a DevTools window dedicated to the Node app. Take a heap snapshot just as the app has started, another somewhere in the middle of the memory growth, and a third close to when it's at the maximum (before crashing!).

Compare the three snapshots, looking for large objects or heavily repeated object references in the later two dumps compared to the first.
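Note that --inspect is not available on Node.js 0.12. Since the question's dependency list already includes the heapdump module, one alternative is to write snapshots from inside the app and load the resulting .heapsnapshot files into the Chrome profiler. A rough sketch, assuming the heapdump package's writeSnapshot() API:

// Writes .heapsnapshot files to the working directory; load them in the
// Chrome DevTools Profiles/Memory tab and compare snapshots over time.
var heapdump = require('heapdump');

function snapshot(label) {
  heapdump.writeSnapshot(Date.now() + '-' + label + '.heapsnapshot', function (err, filename) {
    if (err) console.error('heapdump failed:', err);
    else console.log('heap snapshot written to', filename);
  });
}

// Take one snapshot shortly after startup, then more at intervals
// (the 10-minute interval is arbitrary; aim for early, mid-growth and
// near-peak snapshots).
snapshot('start');
setInterval(function () { snapshot('interval'); }, 10 * 60 * 1000);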


Native leaks

If the GC logs don't show an increase in memory but the OS does for the node process, it might be a native leak.

OSX has a useful dev tool called leaks that can find unreferenced memory in a process without full debugging. I believe valgrind can do the same type of thing with --leak-check=yes.

These might not identify what is wrong; they just confirm the problem is a native leak.

Modules

To find native modules in your app use the following command:

find node_modules -type f -name "*.node"

For each of these modules, produce a small test case of how each module is used in your application and stress test it while checking for leaks.
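A throwaway stress harness might look like the sketch below. The doWork() body is just a placeholder; replace it with a small reproduction of how your app actually calls the native module (bignum via simpleflake, couchbase, etc.) and watch whether rss keeps growing:

// Generic stress harness: call doWork() repeatedly and log rss so you can
// see whether memory grows without bound. The body of doWork() is a
// placeholder to be replaced with real calls into the native module.
function doWork() {
  // e.g. create, use and discard objects from the native module here
  return new Array(100).join('x');
}

var iterations = 0;
setInterval(function () {
  for (var i = 0; i < 10000; i++) doWork();
  iterations += 10000;
  console.log('iterations=' + iterations,
    'rss=' + Math.round(process.memoryUsage().rss / 1048576) + 'MB');
}, 1000);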

You can also try excluding or disabling any of the native modules from your application temporarily, to rule them out or highlight them as a cause.

Node.js

If possible, try a different release of Node.js. Move to the latest major version, latest minor/patch, or back a major version to see if anything improves or changes in the memory profiles.

There are vast improvements in general memory management in the v4.x and v6.x Node.js releases compared to v0.x. If you can update, the issue might go away via a bug fix or a V8 update.

Matt
  • Thanks for citing - here's the full blog post that covers techniques and sample code to track down memory issues: http://apmblog.dynatrace.com/2015/11/04/understanding-garbage-collection-and-hunting-memory-leaks-in-node-js/ – DanielKhan Sep 07 '16 at 11:05
  • Great post. Sorry, I got so involved in trying to make the image link to the post URL in the SO editor that I forgot to add the text too. – Matt Sep 07 '16 at 20:56