I have a very basic http server:

require("http").createServer(function (req, res) {
    res.end("Hello world!");                      
}).listen(8080);                                 

How can I listen for server crashes so I can send a 500 status code in response?

Listening for process.on("uncaughtException", handler) works at process level, but I don't have the request and response objects.

A possible solution I see is wrapping the body of the createServer callback in a try/catch statement, but I'd like to know whether there are better solutions.
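
For reference, the try/catch approach could look roughly like the sketch below. The helper name `withErrorResponse` is made up for illustration and is not part of the http module. The sketch only covers errors thrown synchronously while the handler runs.

```javascript
var http = require("http");

// Hypothetical helper (not part of node): wraps a request handler so
// that a synchronous throw inside it is turned into a 500 response.
function withErrorResponse(handler) {
    return function (req, res) {
        try {
            handler(req, res);
        } catch (err) {
            // res is still in scope here, so we can answer the client.
            res.statusCode = 500;
            res.setHeader("content-type", "text/plain");
            res.end("Server error: " + err.message);
        }
    };
}

var server = http.createServer(withErrorResponse(function (req, res) {
    undefined.foo; // test crash
    res.end("Hello world!");
}));
// server.listen(8080); // uncomment to actually serve requests
```

An error thrown later from an asynchronous callback (a timer, an I/O callback) would not pass through this catch block, because the handler has already returned by then.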

I tried listening for the error event on the server object, but nothing happens:

var s = require("http").createServer(function (req, res) {
    undefined.foo; // test crash
    res.end("Hello world!");                      
});
s.on("error", function () { console.log(arguments); });
s.listen(8080);                                 
Ionică Bizău
  • use a try/catch on the risky parts or on a line that calls the risky parts in an external function. – dandavis Sep 21 '14 at 08:47
  • @dandavis Isn't there another solution? – Ionică Bizău Sep 21 '14 at 08:48
  • if something throws in the server function, then the local _req_ and _res_ variables go bye-bye without a catch, so you couldn't call end() from outside. you might be able to push _req_ to a global, but then how would you keep track of which _req_ was which, and make sure they don't build up? oh, and trying around a function call doesn't slow down the execution of the function in V8, so try/catch need not be as bad as you might suppose. – dandavis Sep 21 '14 at 08:53
  • I don't think there's any workable answer other than a try/catch in each request handler so the code NEVER throws beyond the handler and you always have access to `res` in your `catch` handler. Since requests come in asynchronously, you HAVE to try/catch there where the request started. Here's some [good reading about error handling in node](http://machadogj.com/2013/4/error-handling-in-nodejs.html). – jfriend00 Sep 21 '14 at 09:12
  • More good reading on the topic: http://stackoverflow.com/questions/10390658/how-to-handle-code-exceptions-in-node-js and https://www.joyent.com/developers/node/design/errors and you can also install an express default error handler. – jfriend00 Sep 21 '14 at 09:19
  • @jfriend00 No, I won't use Express for that. I saw other questions regarding Express, so aren't there any other solutions? – Ionică Bizău Sep 21 '14 at 09:24
  • You catch your exceptions at the correct place (in every request where you have the info available to handle it properly). That's how this works. End of question. Add proper error handling. There is no free lunch here. Ohhh and by the way, you have to make sure that there are no exceptions thrown from async callbacks either because those won't get caught even by an exception handler at the request level - you'd have to catch those in the callback. – jfriend00 Sep 21 '14 at 09:26
  • Are you using Express framework for your application? – Paramore Sep 21 '14 at 14:46
  • @Paramore No, and that's why I asked the question: I want to know how to do this with a basic server. – Ionică Bizău Sep 21 '14 at 16:10
  • You can consider using the `domain` module. – Paramore Sep 21 '14 at 17:47
  • The example in [node's domain module's docs](http://nodejs.org/api/domain.html#domain_warning_don_t_ignore_errors) shows how to do this (this is **not** the express domain module). – Mike S Sep 21 '14 at 19:20
  • @MikeS And what happens inside the domain module? `try - catch`? Maybe you can add an answer. :-) – Ionică Bizău Sep 22 '14 at 05:49

1 Answer

Catching and handling the error

You can use node's built-in domain module for this.

Domains provide a way to handle multiple different IO operations as a single group. If any of the event emitters or callbacks registered to a domain emit an error event, or throw an error, then the domain object will be notified, rather than losing the context of the error in the process.on('uncaughtException') handler, or causing the program to exit immediately with an error code.
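
To see that behaviour in isolation, without any HTTP involved, here is a minimal sketch. The helper name `runInDomain` is hypothetical; only `domain.create()`, `d.on('error', ...)` and `d.run()` come from the domain module, and the sketch assumes the module is available in your node version.

```javascript
var domain = require("domain");

// Hypothetical helper: runs `task` inside a fresh domain and hands any
// error thrown by it, even from a later timer callback, to `onError`.
function runInDomain(task, onError) {
    var d = domain.create();
    d.on("error", onError);
    // Timers and callbacks created while `task` runs are implicitly
    // bound to the domain `d`.
    d.run(task);
    return d;
}

runInDomain(function () {
    setTimeout(function () {
        throw new Error("Some random async error");
    }, 10);
}, function (err) {
    console.log("Caught by the domain: " + err.message);
});
```

A plain try/catch around the `setTimeout` call would not see this error, because the callback runs after the surrounding function has returned.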

One very important thing to note is this:

Domain error handlers are not a substitute for closing down your process when an error occurs.

By the very nature of how throw works in JavaScript, there is almost never any way to safely "pick up where you left off", without leaking references, or creating some other sort of undefined brittle state.

Since you're only asking about how to respond with a 500 error, I'm not going to go into how to deal with restarting the server, etc., like the node documentation does; I highly recommend taking a look at the example in the node docs. Their example shows how to capture the error, send an error response back to the client (if possible), and then restart the server. I'll just show the domain creation and sending back a 500 error response. (See the next section regarding restarting the process.)

Domains work similarly to putting a try/catch in your createServer callback. In your callback:

  1. Create a new domain object
  2. Listen on the domain's error event
  3. Add req and res to the domain (since they were created before the domain existed)
  4. Run the domain and call your request handler (this is like the try part of a try/catch)

Something like this:

var domain = require('domain');

function handleRequest(req, res) {
    // Just something to trigger an async error
    setTimeout(function() {
        throw new Error("Some random async error");
        res.end("Hello world!"); // never reached; the throw above fires first
    }, 100);
}

var server = require("http").createServer(function (req, res) {
    var d = domain.create();

    d.on('error', function(err) {
        // We're in an unstable state, so shut down the server.
        // This will only stop new connections, not close existing ones.
        server.close();

        // Send our 500 error
        res.statusCode = 500;
        res.setHeader("content-type", "text/plain");
        res.end("Server error: " + err.message);
    });

    // Since the domain was created after req and res, they
    // need to be explicitly added.
    d.add(req);
    d.add(res);

    // This is similar to a typical try/catch, but the "catch"
    // is now d's error event.
    d.run(function() {
        handleRequest(req, res);
    });
}).listen(8080); 
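
One caveat with the error handler above: if the handler had already started writing the response before the error occurred, the status code can no longer be changed. A possible way to guard against that is sketched below. The helper name `sendServerError` is hypothetical; `res.headersSent` is a property of node's http response object.

```javascript
// Hypothetical helper: sends a 500 response if that is still possible.
// Returns true if the 500 was sent, false if the response had already
// started and could only be terminated.
function sendServerError(res, err) {
    if (res.headersSent) {
        // Too late to change the status line; just end the response.
        res.end();
        return false;
    }
    res.statusCode = 500;
    res.setHeader("content-type", "text/plain");
    res.end("Server error: " + err.message);
    return true;
}
```

Inside the domain's error handler, the three lines that build the 500 response could then be replaced with `sendServerError(res, err);`.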

Restarting the process after an error

By using the cluster module you can nicely restart the process after an error. I'm basically copying the example from the node documentation here, but the general idea is to start multiple worker processes from a master process. The workers are the processes that handle the incoming connections. If one of them has an unrecoverable error (i.e. the ones we're catching in the previous section), then it'll disconnect from the master process, send a 500 response, and exit. When the master process sees the worker process disconnect, it'll know that an error occurred and spin up a new worker. Since there are multiple worker processes running at once, there shouldn't be an issue with missing incoming connections if one of them goes down.

Example code, copied from the node domain module documentation:

var cluster = require('cluster');
var PORT = +process.env.PORT || 1337;

if (cluster.isMaster) {
  // In real life, you'd probably use more than just 2 workers,
  // and perhaps not put the master and worker in the same file.
  //
  // You can also of course get a bit fancier about logging, and
  // implement whatever custom logic you need to prevent DoS
  // attacks and other bad behavior.
  //
  // See the options in the cluster documentation.
  //
  // The important thing is that the master does very little,
  // increasing our resilience to unexpected errors.

  cluster.fork();
  cluster.fork();

  cluster.on('disconnect', function(worker) {
    console.error('disconnect!');
    cluster.fork();
  });

} else {
  // the worker
  //
  // This is where we put our bugs!

  var domain = require('domain');

  // See the cluster documentation for more details about using
  // worker processes to serve requests.  How it works, caveats, etc.

  var server = require('http').createServer(function(req, res) {
    var d = domain.create();
    d.on('error', function(er) {
      console.error('error', er.stack);

      // Note: we're in dangerous territory!
      // By definition, something unexpected occurred,
      // which we probably didn't want.
      // Anything can happen now!  Be very careful!

      try {
        // make sure we close down within 30 seconds
        var killtimer = setTimeout(function() {
          process.exit(1);
        }, 30000);
        // But don't keep the process open just for that!
        killtimer.unref();

        // stop taking new requests.
        server.close();

        // Let the master know we're dead.  This will trigger a
        // 'disconnect' in the cluster master, and then it will fork
        // a new worker.
        cluster.worker.disconnect();

        // try to send an error to the request that triggered the problem
        res.statusCode = 500;
        res.setHeader('content-type', 'text/plain');
        res.end('Oops, there was a problem!\n');
      } catch (er2) {
        // oh well, not much we can do at this point.
        console.error('Error sending 500!', er2.stack);
      }
    });

    // Because req and res were created before this domain existed,
    // we need to explicitly add them.
    // See the explanation of implicit vs explicit binding below.
    d.add(req);
    d.add(res);

    // Now run the handler function in the domain.
    d.run(function() {
      handleRequest(req, res);
    });
  });
  server.listen(PORT);
}

// This part isn't important.  Just an example routing thing.
// You'd put your fancy application logic here.
function handleRequest(req, res) {
  switch(req.url) {
    case '/error':
      // We do some async stuff, and then...
      setTimeout(function() {
        // Whoops!
        flerb.bark();
      });
      break;
    default:
      res.end('ok');
  }
}

Note: I still want to stress that you should take a look at the domain module documentation and look at the examples and explanations there. It explains most, if not all, of this, the reasoning behind it, and some other situations you might run into.

Mike S
  • Some interesting content in this answer; however, the process is killed when using a timeout. *Their example shows how to capture the error (even when it happens in an async function)* -- only if I remove the timeout (async operation) does the 500 response come; otherwise the process is killed by the error thrown. Is there any way to catch such errors? Imagine there is a big application and errors can appear when making a request (`foo.something`, where `foo` is `undefined`) -- sure, they are bugs, but how can we nicely handle such exceptions? Thanks! – Ionică Bizău Oct 02 '14 at 15:54
  • I'm not sure I understand what you're asking. The example code I provided should return the 500 response even though the error occurs within a `setTimeout` (you can replace the `throw` with `undefined.foo()` or other forced error and it should still work). Note that it will still terminate the process the way it's written, but the 500 response should go out first. – Mike S Oct 02 '14 at 16:03
  • Ah, I see. Awesome! However, in which cases will there be issues if the server/process is not closed? – Ionică Bizău Oct 02 '14 at 16:05
  • You _could_ remove the `server.close()` line and the process should stay open but, due to the way `throw` works in javascript, the process wouldn't be very stable. If you take a look at the example in the [`domain` module's docs](http://nodejs.org/api/domain.html), it shows a way to use the `cluster` module to automatically restart the process after the 500 response is sent. – Mike S Oct 02 '14 at 16:11
  • I'm aware of restarting the process since I want something stable, that will be restarted very rarely. Imagine what happened if the guys from GitHub would restart the things every 500 request... :-) – Ionică Bizău Oct 02 '14 at 16:16
  • That's why you use the `cluster` module ;-) With a cluster in place, you have **multiple** processes handling requests. If one goes down, then there are others to pick up the slack while a new process is spun up. – Mike S Oct 02 '14 at 16:19
  • Awesome, can you add this to your answer? – Ionică Bizău Oct 02 '14 at 16:32
  • Sure. New section added to answer. – Mike S Oct 02 '14 at 16:56