
We are running into a crash every 8 hours or so. The app runs as a service on an Ubuntu 18.04 AWS EC2 server. The error log is as follows:

Jun  3 23:13:12 ip-<IP HERE> <PROJECT NAME>[13506]:       -x- GET / 200 4,495ms 13.38kb
Jun  3 23:13:29 ip-<IP HERE> <PROJECT NAME>[13506]:       <-- GET /
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     <--- Last few GCs --->
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     [13546:0x2c50070] 16136092 ms: Mark-sweep 1531.3 (1818.9) -> 1531.3 (1787.9) MB, 564.4 / 0.0 ms  (+ 0.0 ms in 0 steps since start of marking, biggest step 0.0 ms, walltime since start of marking 565 ms) last resort GC in old space requested
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     [13546:0x2c50070] 16136657 ms: Mark-sweep 1531.3 (1787.9) -> 1531.3 (1787.9) MB, 564.6 / 0.0 ms  last resort GC in old space requested
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     <--- JS stacktrace --->
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     ==== JS stack trace =========================================
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     Security context: 0x31dffda25891 <JSObject>
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:         1: isAlive(aka relayAlive) [/home/ubuntu/<PROJECT NAME>/server/controllers/relay.js:~331] [pc=0x2553280b906c](this=0x86baad75b99 <Object map = 0x1d125244b269>,ip=0x5d343cff359 <String[12]: 10.128.3.231>,port=0x5d343cff381 <String[2]: 80>,id=0x5d343cfede1 <ObjectID map = 0x49650a93921>)
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:         2: /* anonymous */(aka /* anonymous */) [/home/ubuntu/<PROJECT NAME>/server/controllers/cpanel.js:190] [bytecode=...
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:      1: node::Abort() [/usr/bin/node]
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:      2: 0x8cd49c [/usr/bin/node]
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:      3: v8::Utils::ReportOOMFailure(char const*, bool) [/usr/bin/node]
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:      4: v8::internal::V8::FatalProcessOutOfMemory(char const*, bool) [/usr/bin/node]
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:      5: v8::internal::Factory::NewFixedArray(int, v8::internal::PretenureFlag) [/usr/bin/node]
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:      6: v8::internal::HashTable<v8::internal::SeededNumberDictionary, v8::internal::SeededNumberDictionaryShape>::NewInternal(v8::internal::Isolate*, int, v8::internal::PretenureFlag) [/usr/bin/node]
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:      7: v8::internal::HashTable<v8::internal::SeededNumberDictionary, v8::internal::SeededNumberDictionaryShape>::New(v8::internal::Isolate*, int, v8::internal::PretenureFlag, v8::internal::MinimumCapacity) [/usr/bin/node]
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:      8: v8::internal::HashTable<v8::internal::SeededNumberDictionary, v8::internal::SeededNumberDictionaryShape>::EnsureCapacity(v8::internal::Handle<v8::internal::SeededNumberDictionary>, int, v8::internal::PretenureFlag) [/usr/bin/node]
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:      9: v8::internal::Dictionary<v8::internal::SeededNumberDictionary, v8::internal::SeededNumberDictionaryShape>::Add(v8::internal::Handle<v8::internal::SeededNumberDictionary>, unsigned int, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyDetails, int*) [/usr/bin/node]
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     10: 0xd80d96 [/usr/bin/node]
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     11: v8::internal::JSObject::AddDataElement(v8::internal::Handle<v8::internal::JSObject>, unsigned int, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::internal::Object::ShouldThrow) [/usr/bin/node]
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     12: v8::internal::Object::AddDataProperty(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::internal::Object::ShouldThrow, v8::internal::Object::StoreFromKeyed) [/usr/bin/node]
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     13: v8::internal::Object::SetProperty(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::internal::LanguageMode, v8::internal::Object::StoreFromKeyed) [/usr/bin/node]
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     14: v8::internal::Runtime_SetProperty(int, v8::internal::Object**, v8::internal::Isolate*) [/usr/bin/node]
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     15: 0x255321f042fd
Jun  3 23:13:31 ip-<IP HERE> <PROJECT NAME>[13506]:     [nodemon] app crashed - waiting for file changes before starting...

I have tried adding the following as bash aliases (the nodeversion one just to make sure the aliases are working):

alias npm='node --max_old_space_size=8192 /usr/bin/npm'
alias nodeversion='node --version'

I have even tried running grunt with the --max-old-space-size flag out of desperation. The following is from our package.json:

    "start": "grunt --max-old-space-size=8192",

I am running console.log(v8.getHeapStatistics()) to dump the heap statistics, but heap_size_limit never goes over 1.5 GB. We have more than enough RAM to give node 8 GB.

Here is the console output when we run sudo service <service name> start:

> koan@0.7.0 start /home/ubuntu/<Project Folder>
> grunt --max-old-space-size=8192
Running "concurrent:tasks" (concurrent) task
Running "nodemon:dev" (nodemon) task
Running "watch" task
Waiting...
[nodemon] 1.19.4
[nodemon] to restart at any time, enter `rs`
[nodemon] watching dir(s): *.*
[nodemon] watching extensions: js,mjs,json
[nodemon] starting `node app.js`
    Mongo Server Running..
    KOAN listening on port 17100...
    { total_heap_size: 92770304,
      total_heap_size_executable: 3670016,
      total_physical_size: 87881664,
      total_available_size: 1420343496,
      used_heap_size: 60364472,
      heap_size_limit: 1501560832,
      malloced_memory: 8192,
      peak_malloced_memory: 6104104,
      does_zap_garbage: 0 }

The service file that is running the application is this:

[Unit]
Description=<Project Name>
#Requires=After=mysql.service       # Requires the mysql service to run first

[Service]
ExecStart=/usr/bin/npm start
# Required on some systems
WorkingDirectory=/home/ubuntu/<Project Folder>
Restart=always
# Restart service after 10 seconds if node service crashes
RestartSec=10
# Output to syslog
StandardOutput=syslog
StandardError=syslog
SyslogIdentifier=<Project Name>
User=ubuntu
#Group=<alternate group>
Environment=NODE_ENV=production PORT=17100

[Install]
WantedBy=multi-user.target
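One thing worth noting: ExecStart does not go through an interactive shell, so the bash aliases above never apply to the service. A sketch of setting the flag in the unit itself (not yet tried here):

```ini
[Service]
# Sketch: NODE_OPTIONS set at the unit level is inherited by npm, grunt,
# nodemon, and the final `node app.js` child process alike.
Environment=NODE_ENV=production PORT=17100 "NODE_OPTIONS=--max-old-space-size=8192"
```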

The web app runs correctly for around 4-5 hours before crashing. Any help would be greatly appreciated. Thank you!

EDIT: This worked locally on my Mac, but not live on Ubuntu 18.04. With cross-env, the output was:

> koan@0.7.0 start /home/ubuntu/<Project Folder>
> cross-env NODE_OPTIONS='--max-old-space-size=8192' grunt
Running "concurrent:tasks" (concurrent) task
    Running "nodemon:dev" (nodemon) task
    Running "watch" task
    Waiting...
    [nodemon] 1.19.4
    [nodemon] to restart at any time, enter `rs`
    [nodemon] watching dir(s): *.*
    [nodemon] watching extensions: js,mjs,json
    [nodemon] starting `node app.js`
    Mongo Server Running..
    KOAN listening on port 17100...
    { total_heap_size: 93294592,
      total_heap_size_executable: 3670016,
      total_physical_size: 88746312,
      total_available_size: 1420262064,
      used_heap_size: 60438832,
      heap_size_limit: 1501560832,
      malloced_memory: 8192,
      peak_malloced_memory: 6104104,
      does_zap_garbage: 0 }
1 Answer

I find the easiest thing to do is to use the NODE_OPTIONS environment variable instead:

"start": "NODE_OPTIONS='--max-old-space-size=8192' grunt"

The benefit of doing it this way is that any child processes that happen to be node will also inherit the environment variable.

Note that not all shells support this technique of setting environment variables. cross-env is a nice tool to work around that. You can add that as a dependency and then do something like this:

"start": "cross-env NODE_OPTIONS='--max-old-space-size=8192' grunt"
Jacob
  • Tried with cross-env. Worked locally, but did not work in production. I'll post the output as an edit. – Hunter A. Jun 04 '20 at 00:33
  • In production is it running `npm run start` or something else? – Jacob Jun 04 '20 at 00:34
  • I'm not sure if `v8.getHeapStatistics()` will tell you this limit, but node does respect it; in our production environment we definitely see that environment variable working properly. – Jacob Jun 04 '20 at 00:39
  • Production is using /usr/bin/npm start – Hunter A. Jun 04 '20 at 00:40
  • At least on my local machine, ```v8.getHeapStatistics``` presents it as heap_size_limit. – Hunter A. Jun 04 '20 at 00:41
  • Odd... same version of node? – Jacob Jun 04 '20 at 00:57
  • No. Now that I look at it, Production is: ```v8.17.0``` and my local is ```v11.6.0``` – Hunter A. Jun 04 '20 at 01:00
  • That's your problem, then. The maximum `max-old-space-size` for node 8 is 1.7GiB, I believe. Newer versions of node increase that limit. – Jacob Jun 04 '20 at 01:18
  • I was able to push it up to 2.19 GB after updating to node 12.18.0, but it is still not giving me the 8 GB I am specifying in the Node options. – Hunter A. Jun 04 '20 at 01:48
  • Are you using 64-bit node? That's a prerequisite to being able to address a larger heap size. – Jacob Jun 04 '20 at 01:49
  • I am running 64 bit node. Running the ```node -p "process.arch"``` command output "x64" – Hunter A. Jun 04 '20 at 13:10
  • Here are some things to check; in the node process where you're doing `v8.getHeapStatistics`, also log out `process.env.NODE_OPTIONS` to validate that `--max-old-space-size=8192` actually got set on the process. Also, I don't know for sure, but maybe you have some limits set in your operating system on memory usage for processes or users. Check `ulimit` or https://unix.stackexchange.com/questions/44985/limit-memory-usage-for-a-single-linux-process for ideas. – Jacob Jun 04 '20 at 14:44
  • I think you might have been right that it honors the setting even though it doesn't report it. No crash since upgrading node. Fingers crossed it stays that way. Thanks for all your help. We are looking for memory leaks and the like now. – Hunter A. Jun 05 '20 at 13:13
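For anyone checking the OS-limit angle from the comments above, a quick way to inspect per-session memory caps (a sketch; "unlimited" means no cap is set):

```shell
# Check shell/user resource limits that could cap the node heap
# regardless of --max-old-space-size.
ulimit -v   # max virtual memory, in KiB ("unlimited" = no cap)
ulimit -a   # full list of limits for the current session
```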