26

If you look at the beginning of the Node.js documentation for domains it states:

By the very nature of how throw works in JavaScript, there is almost never any way to safely "pick up where you left off", without leaking references, or creating some other sort of undefined brittle state.

Again in the code example it gives in that first section it says:

Though we've prevented abrupt process restarting, we are leaking resources like crazy

I would like to understand why this is the case? What resources are leaking? They recommend that you only use domains to catch errors and safely shutdown a process. Is this a problem with all exceptions, not just when working with domains? Is it a bad practice to throw and catch exceptions in Javascript? I know it's a common pattern in Python.

EDIT

I can understand why there could be resource leaks in a non garbage collected language if you throw an exception because then any code you might run to clean up objects wouldn't run if an exception is thrown.

The only reason I can imagine with Javascript is if throwing an exception stores references to variables in the scope where the exception was thrown (and maybe things in the call stack), thus keeping references around, and then the exception object is kept around and never gets cleaned up. Unless the leaking resources referred to are resources internal to the engine.

UPDATE

I've Written a blog explaining the answer to this a bit better now. Check it out

Asad Saeeduddin
  • 46,193
  • 6
  • 90
  • 139
Justin Warkentin
  • 9,856
  • 4
  • 35
  • 35
  • You mind find this question useful http://stackoverflow.com/questions/14301839/javascript-asynchronous-exception-handling-with-node-js – Benjamin Gruenbaum Apr 06 '13 at 18:37
  • 1
    Unfortunately that question just talks about using domains to catch asynchronous exceptions. It doesn't mention memory leaks at all. – Justin Warkentin Apr 07 '13 at 05:11
  • 1
    Which is why it was a comment and not an answer :) It's how to make domain try/catching easier to work with. As for the question, it's about closure leakage. You throw an exception, but you have for example a request object that still has a reference in an event, and the event has reference to the request object and the two are not garbage collected. For example if you have a MongoDB connection which you don't close because an exception is thrown, it might implicitly stay open. – Benjamin Gruenbaum Apr 07 '13 at 05:20
  • Good point. So if that's the only issue though, then it sounds like resource leakage is limited to exceptions thrown in sections of code where there are open streams (sockets, files, etc). If that's true, I wish they'd explain it better, because then programmers can take it into account. It sounds like Javascript needs something like Python's context managers. – Justin Warkentin Apr 07 '13 at 05:28
  • Although, the garbage collector is built to handle cyclical references, so once any open streams go out of scope and the only references left to it are cyclical I would think it should be able to detect that and automatically close the stream. – Justin Warkentin Apr 07 '13 at 05:30
  • Do you mean something like python's "with" or C#'s "using" ? If so, this is sort of what domains address. I think they provide a very elegant solution to a hard problem. How is this handled in python with twisted? – Benjamin Gruenbaum Apr 07 '13 at 05:49
  • Yes, like Python's 'with'. The thing about context managers is that when you write one you can define code that runs no matter what when exiting the context. So, even if an exception is thrown within the 'with' block, the stream will still be closed, avoiding resource leakage. I don't know enough about domains to really understand how they solve this problem yet, but if they do then I'm curious why they would still recommend shutting down the program on any exception. – Justin Warkentin Apr 07 '13 at 05:57

3 Answers3

14

Unexpected exceptions are the ones you need to worry about. If you don't know enough about the state of the app to add handling for a particular exception and manage any necessary state cleanup, then by definition, the state of your app is undefined, and unknowable, and it's quite possible that there are things hanging around that shouldn't be. It's not just memory leaks you have to worry about. Unknown application state can cause unpredictable and unwanted application behavior (like delivering output that's just wrong -- a partially rendered template, or an incomplete calculation result, or worse, a condition where every subsequent output is wrong). That's why it's important to exit the process when an unhandled exception occurs. It gives your app the chance to repair itself.

Exceptions happen, and that's fine. Embrace it. Shut down the process and use something like Forever to detect it and set things back on track. Clusters and domains are great, too. The text you were reading is not a caution against throwing exceptions, or continuing the process when you've handled an exception that you were expecting -- it's a caution against keeping the process running when unexpected exceptions occur.

Eric Elliott
  • 4,711
  • 1
  • 25
  • 33
  • Thanks for the explanation. I actually use forever to manage my processes already. I have some servers that use Socket.io and maintain a number of active websockets. I was concerned about having to shutdown the server if there is an exception because that means disconnecting all the other clients. I try to handle every exception I can and I've removed my global exception handler. I just use forever if it actually crashes for any reason. Thanks again! – Justin Warkentin Apr 08 '13 at 14:29
  • 2
    You shouldn't just remove your global exception handler. It still can be very useful to log the exception before you exit. – laktak Apr 08 '13 at 15:13
  • @chris That's true, but since I'm using forever to manage my logs, by not handling the exception it crashes, dumps out the stack trace, and then forever just restarts it. So I still have it logged and can go back and review the stack trace. – Justin Warkentin Apr 10 '13 at 15:22
  • +1 for the Unexpected exceptions. is there any demos on this unexpected exceptions which can cause those bizzarre behaviors? heard it a lot, never saw it.. – bpceee Oct 22 '14 at 10:53
  • I don't have the time to create one at the moment, but you could create one for yourself simply by instantiating literally anything (try reading in some big files), and then cleaning up references in a success handler. Then create a condition that throws an error. For example, middleware that throws when you pass a param called "err". "Forget" to handle the error case and clean up the file refs. Boom. Memory leak. – Eric Elliott Oct 31 '14 at 23:54
12

I think when they said "we are leaking resources", they really meant "we might be leaking resources". If http.createServer handles exceptions appropriately, threads and sockets shouldn't be leaked. However, they certainly could be if it doesn't handle things properly. In the general case, you never really know if something handles errors properly all the time.

I think they are wrong / very misleading when they said "By the .. nature of how throw works in JavaScript, there is almost never any way to safely ..." . There should not be anything about how throw works in Javascript (vs other languages) that makes it unsafe. There is also nothing about how throw/catch works in general that makes it unsafe - unless of course you use them wrong.

What they should have said is that exceptional cases (regardless of whether or not exceptions are used) need to be handled appropriately. There are a few different categories to recognize:

A. State

  1. Exceptions that occur while external state (database writing, file output, etc) is in a transient state
  2. Exceptions that occur while shared memory is in a transient state
  3. Exceptions where only local variables might be in a transient state

B. Reversibility

  1. Reversible / revertible state (eg database rollbacks)
  2. Irreversible state (Lost data, unknown how to reverse, or prohibitive to reverse)

C. Data criticality

  1. Data can be scrapped
  2. Data must be used (even if corrupted)

Regardless of the type of state you're messing with, if you can reverse it, you should do that and you're set. The problem is irreversible state. If you can destroy the corrupted data (or quarantine it for separate inspection), that is the best move for irreversible state. This is done automatically for local variables when an exception is thrown, which is why exceptions excel at handling errors in purely functional code (ie functions with no possible side-effects). Likewise, any shared state or external state should be deleted if that's acceptable. In the case of shared state, either throw exceptions until that shared state becomes local state and is cleaned up by unrolling of the stack (either statically or via the GC), or restart the program (I've read people suggesting the use of something like nodejitsu forever). For external state, this is likely more complicated.

The last case is when the data is critical. Well, then you're gonna have to live with the bugs you've created. Everyone has to deal with bugs, but its the worst when your bugs involve corrupted data. This will usually require manual intervention (reconstructing the lost/damaged data, selectively pruning, etc) - exception handling won't get you the whole way in the last case.

I wrote a similar answer related to how to handle mid-operation failure in various cases in the context of multiple updates to some data storage: https://stackoverflow.com/a/28355495/122422

Community
  • 1
  • 1
B T
  • 57,525
  • 34
  • 189
  • 207
  • Thanks for the answer. I very much agree. I think the documentation is misleading at best, and very indescriptive on the matter. After first seeing the docs I was thinking, why would they provide a way to throw and catch exceptions if there's something inherently wrong with the way it works in the language? It's a common pattern and other languages seem to do it just fine. I don't think there was a way for `http.createServer()` to handle exceptions properly and cleanup before domains. I'm still convinced that Node.js needs Python style context managers. – Justin Warkentin Apr 10 '13 at 15:37
  • 1
    BTW, I wrote up a [blog](http://www.teknically-speaking.com/2013/04/javascript-and-nodejs-exceptions-and.html) on all this last night and I just updated it and linked to your answer here. – Justin Warkentin Apr 10 '13 at 17:06
  • It's also not uncommon for server code to be stateless (or "functional", as the answer mentions), since that's required if you want to be able to stand up multiple copies of the server to handle extra traffic. If that's the case, I really don't see an issue with letting Node keep running when an error occurs. As long as everyone uses `finally` the way you're supposed to, it should be fine. – Scotty Jamison Mar 01 '23 at 16:21
8

Taking the sample from the node.js documentation:

var d = require('domain').create();
d.on('error', function(er) {
  // The error won't crash the process, but what it does is worse!
  // Though we've prevented abrupt process restarting, we are leaking
  // resources like crazy if this ever happens.
  // This is no better than process.on('uncaughtException')!
  console.log('error, but oh well', er.message);
});
d.run(function() {
  require('http').createServer(function(req, res) {
    handleRequest(req, res);
  }).listen(PORT);
});

In this case you are leaking connections when an exception occurs in handleRequest before you close the socket.

"Leaked" in the sense that you finished processing the request without cleaning up afterwards. Eventually the connection will time out and close the socket, but if your server is under high load it may run out of sockets before that happens.

Depending on what you do in handleRequest you may also be leaking file handles, database connections, event listeners, etc.

Ideally you should handle your exceptions so you can clean up after them.

laktak
  • 57,064
  • 17
  • 134
  • 164