Understanding Try/Catch and Domains for Error Handling in Node

Question

I've been researching the proper way to handle errors in Node and have found some good answers here on StackOverflow and on NodeJS' site, such as How do I prevent node.js from crashing? try-catch doesn't work and the NodeJS docs themselves: http://nodejs.org/api/domain.html.

However, I'm left with a couple questions regarding when and where to use try/catch and/or domains. I realize this has to do with asynchronous vs synchronous code, but even inside the sample code provided on NodeJS' site regarding domains, they use a try/catch inside the domain's error handler. Could someone explain this in some detail, wouldn't the try/catch fail to catch asynchronous errors inside the error handler?

Beyond that, NodeJS' docs suggest that you should still end the process on an exception, and that's why the code from the Domain docs suggest using a cluster to fork a new child process/worker when the exception is caught. The main reason given is:

By the very nature of how throw works in JavaScript, there is almost never any way to safely "pick up where you left off", without leaking references, or creating some other sort of undefined brittle state.

Can someone explain this? What is the nature of how throw works in Javascript? Why would resources be leaking like crazy? And is it always really necessary to restart the process or kill/start a worker?

For example, I was implementing the JugglingDB ORM and at one point forgot to launch my local mysql server. I encountered an ECONNREFUSED error which crashed the process. Realizing this could happen in a production environment (DB crashes or is temporarily unavailable), I wanted to catch this error and handle it gracefully; retry the connection, maintain a state variable about the DB, and possibly handle requests by responding with a temporarily unavailable message. Try/Catch simply didn't catch the error and although I see I could use a domain, using the recommended strategy, I would be in an endless loop of killing and starting workers until the DB was back online.

JugglingDB, for whatever reason, only has a "connected" event, but doesn't have any kind of callback function which passes an error object; it tries to connect the moment you instantiate the class and it throws errors which are not caught and emitted in graceful way. This has made me want to look at other ORMs, but that still doesn't answer my questions regarding how to handle such a situation. Would it be wrong to use a domain to catch potential connection errors and handle it gracefully without starting a new process?

Here is the question/issue I posted to the JugglingDB github: https://github.com/1602/jugglingdb/issues/405, and here is the stack trace from the error produced by JugglingDB when a server doesn't exist (only occurs when the pooling option is enabled):

Error: connect ECONNREFUSED
    at errnoException (net.js:901:11)
    at Object.afterConnect [as oncomplete] (net.js:892:19)
    --------------------
    at Protocol._enqueue (/Users/aaronstorck/Sites/site/node_modules/jugglingdb-mysql/node_modules/mysql/lib/protocol/Protocol.js:110:48)
    at Protocol.handshake (/Users/aaronstorck/Sites/site/node_modules/jugglingdb-mysql/node_modules/mysql/lib/protocol/Protocol.js:42:41)
    at PoolConnection.Connection.connect (/Users/aaronstorck/Sites/site/node_modules/jugglingdb-mysql/node_modules/mysql/lib/Connection.js:101:18)
    at Pool.getConnection (/Users/aaronstorck/Sites/site/node_modules/jugglingdb-mysql/node_modules/mysql/lib/Pool.js:42:23)
    at Pool.query (/Users/aaronstorck/Sites/site/node_modules/jugglingdb-mysql/node_modules/mysql/lib/Pool.js:185:8)
    at initDatabase (/Users/aaronstorck/Sites/site/node_modules/jugglingdb-mysql/lib/mysql.js:62:20)
    at initializeConnection (/Users/aaronstorck/Sites/site/node_modules/jugglingdb-mysql/lib/mysql.js:49:9)
    at Object.initializeSchema [as initialize] (/Users/aaronstorck/Sites/site/node_modules/jugglingdb-mysql/lib/mysql.js:33:5)
    at new Schema (/Users/aaronstorck/Sites/site/node_modules/jugglingdb/lib/schema.js:105:13)
    at Application.loadConnections (/Users/aaronstorck/Sites/site/core/application.js:95:40)

Process finished with exit code 8

Thank you in advance for any part of this you can help me understand! :)

Check [this](http://www.joyent.com/developers/node/design/errors) out. — Aleksei Zabrodskii, May 28 '14 at 12:31

Farid Nouri Neshat · Answer 1 · 2014-05-27T13:38:58.067

Looking at JugglingDB, and trying to connect to a non-existent mysql server on my laptop, I do get ECONNREFUSED, but it's only logged every few seconds(tries to reconnect) and doesn't crash my process. I downgrade my jugglingdb-mysql to 0.0.6, then I can replicate what you saying. And I looked at jugglingdb-mysql source and I found out if an error on connection is found it's just thrown. That's bad and not a good behavior. Although spamming logs is bad too, but throwing errors that can be handled is worse. So I suggest to upgrade to 0.0.7 or 0.0.8(latest), so you don't have to worry about this error. I just make a bug report for JugglingDB or jugglingdb-mysql to propagate errors back properly just like node-mysql does. node-mysql is very good example on how it's done.

Now let's go to how handle errors in node.js. Not all errors are thrown in node.js. For example you can do this:

require('fs').readFile('non-existent-file-yes-gimmie-a-error', function (error) { })

The callback will be called with an error, but it won't be thrown. If you are a module developer, then at cases like this, always propagate the error. Either call a callback with error, or call a error handler or emit error event on your event emitter. the older version of jugglingdb-mysql is a very bad example of how errors should be handled. If you can't catch the error with try catch, then you shouldn't throw it. Only throw when it can be catched, just like node.js core library functions. If you do require('fs').readFile() it will throw error immediately, which is catchable. But in cases where the error is found, after the function has returned(on handing of an async event), then it will call the callback with the error.

Now I'm sure you have experienced more crashes, form uncatchable thrown errors. They are most probably from event emitter error events. In node.js when an error event is emitted, if there are no handlers it will thrown. So if you want to catch errors from event emitters just add a error event. An example of this is fs.createReadStream that will return a event emitter:

require('fs').createReadStream('non-exitent-file-gimmie-a-error')

This will certainly crash the process, but if you can handle the error(for example give 404, to the http request that caused this) then add a error handler and it won't throw the error anymore:

require('fs').createReadStream('non-exitent-file-gimmie-a-error').on('error', handleError)

Aside the I/O errors, then there are type and reference errors. Those are the ones you have to be careful of handling them.

For example, you can have something like this(which you won't but just for educational purposes):

var routes = {
    '/' : function () {...},
    '/one' : function () {...},
    '/two' : function () {...}
}

require('http').createServer(function (req, res) {
    fs.open('layout.html', 'r', function (err, fd) {
        if (err) return errorHandler(err);
        var buffer = new Buffer(4000); // Lets say we know our file is 4000 bytes exatly

        fs.read(fd, buffer, 0, 4000, function (err, data) {
            if (err) return errorHandler(err);

            try {
                routes[req.url](req, res, data);
                fs.close(fd);
            } catch (e) {
                errorHandler(err);
            }
        });
    });

    function errorHandler(err) {
        res.writeHead(404);
        res.end();
    }
}).listen(1337)

Now as you see if routes[req.url] doesn't exist, an error will be throw because we are trying to call errorHandler, but the file is left open and we forgot to close it on error. If 10000 requests with wrong URL are made, then you run out of your process max open files limit. You can fix this, but putting the fs.close(fd) in the finally clause instead.

Let's imagine this, but without the try and catch, but catching the error with a global domain. In some cases, you wouldn't know the state of your program anymore, so on error you can't just decide to let the app go on, because in cases like this, it would leak file descriptors.

This problem applies to anywhere we have to write clean up code and as developers we don't always think about all the different inputs we receive and we will always make mistakes. That is why it is suggest to crash the process. You can catch the error with process.on('uncaughtException') or domain.on('error') but you have to kill the process after you have done your clean up.

But you have to be careful about this. Try to crash when you don't know where the error comes from. If you do know where is it from(like in above case, we only have one file open), then clean it it's resources up and let your process go on, because an attacker can find out how to crash your process with malicious input and DOS it.

In cases where you decide to do some stuff and then crash, make sure you set a timeout and then crash the process when the timeout happens. unref the timeout on node v0.10 to make sure it won't keep the process alive, and on v0.8 you can use a module like addTimeout, that will clear the timeout when callback called.

I was trying to make sure I had the latest version of JugglindDB and jugglingdb-mysql by pulling directly from their git repositories. It seem my assumption that usually the NPM is behind the repository was wrong in this case. Oddly jugglingdb-mysql's NPM was at 0.0.8 while the git repo was at 0.0.6, meanwhile their release branches only go up to 0.0.7. I switched to the NPM versions and now have version 0.2.14 for jugglingdb and 0.0.8 for jugglingdb-mysql. I confirmed this via the package.json file in their respective node_modules folders. My process still finishes with 'exit code 8' :/ — Aaron Storck, May 27 '14 at 21:27
I need to respond to other parts of your answer, but one thing at a time :) — Aaron Storck, May 27 '14 at 21:35
The crashing error only occurs only when the pooling option is turned on. The behavior as you reported with 0.0.8 (logging the error and automatic reconnection attemps) is working with pooling turned off however. — Aaron Storck, May 27 '14 at 21:57
That's the same behavior as what was happening in 0.0.6. The error is thrown unconditionally. The only way to catch that is to use a domain. — Farid Nouri Neshat, May 28 '14 at 02:08
With regard to domains; does it allow you to handle errors for only the section of code within the run context? Or does it handle any uncaught exceptions? Am I right to think that poor design on the part of JugglingDB is the reason that using a domain is the only way to catch the errors? Do you think it would be more wise try to find an alternate ORM or would it be trivial (and maybe a good learning experience) to try to fork & solve the problem and maybe submit a pull request? Would using a domain be a poor design choice, like a band-aid to cover the poor design of a dependency? — Aaron Storck, May 28 '14 at 06:30
Domains are meant to handle the uncaught exceptions in their context. Any uncaught error that is thrown in the context of a domain will be emitting the `error` event. I believe domains are meant to be the last solution to catch errors. Ideally, looking at how it's done in node code api, jugglingdb or jullgingdb-mysql should either accept a callback, or emit an `error` event. It seems trivial to me to fix. Make sure you talk to authors first before it. band-aid is definitely a good choice of word :) — Farid Nouri Neshat, May 28 '14 at 06:56

Understanding Try/Catch and Domains for Error Handling in Node

1 Answers1