
Please note that this question is not about keeping a process running if it crashes, or restarting it when a new deploy is done. This question is about how to restart without killing pending operations.

I have a very busy Node application that receives a lot of hits every second. My application runs functions that take quite a while before they return (for example, a YouTube upload via its API).

The problem I am having is that when I deploy a new version of the app, the process gets restarted, and as a result anything that was "pending" gets basically killed as well. This means that 10 YouTube uploads are potentially killed and need to be restarted. Now:

  • Emptying the event queue before restarting is basically impossible, since I could be waiting for quite a while
  • Killing the process as-is is proving problematic

The ideal solution would be to make sure that any existing ongoing request is satisfied, but serve any new request with the newly deployed code.

A possible idea:

  • Have a master process that takes connections. This process never changes
  • When there is an update, send a signal to this process which will reload the "runner"
  • At this point, any new request will go through the updated runner

The only time you would need to truly restart the process is when the master, connection-accepting process itself needs updating.
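Something close to this pattern can be sketched with Node's built-in cluster module, where the master holds the listening socket and workers can be swapped out. The SIGHUP convention below is an assumption, not an established one:

```js
// master.js - minimal sketch using Node's built-in cluster module.
// The master only holds the listening socket; workers run the app code.
const cluster = require('cluster');
const http = require('http');

if (cluster.isMaster) {
  let worker = cluster.fork(); // re-runs this file as a worker

  // On SIGHUP (sent by the deploy script, for example), fork a fresh
  // worker that picks up the newly deployed code, then retire the old one.
  process.on('SIGHUP', () => {
    const old = worker;
    worker = cluster.fork();
    worker.once('listening', () => {
      // Stop routing new connections to the old worker; it exits once
      // its in-flight requests have finished.
      old.disconnect();
    });
  });
} else {
  // Worker: an ordinary HTTP server; the master shares port 3000 with it.
  http.createServer((req, res) => {
    res.end('handled by pid ' + process.pid);
  }).listen(3000);
}
```

Note that disconnect() only waits for open connections to finish; background work such as timers needs separate handling (see the update below).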

Is this approach something that is "done"? Is there a module that does it? Or is it total overkill?

UPDATE

Interesting answer: https://stackoverflow.com/a/10711410/829771. However, it's unrealistic to wait for the event loop to be empty before restarting the process.

BUT there is another level of complication here: if the server has timers, for example a task that runs every 5 minutes, then following what I wrote above you would end up with two copies of it running. So the "obsoleted" process must be notified with a signal, and must listen for it and stop any "background" operations when it arrives. Please keep in mind that this is not theory: I do have setInterval() calls in my application.
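For example, a sketch of what the obsoleted process could do when signaled; the choice of SIGTERM and runFiveMinuteTask are placeholders, not anything from the actual app:

```js
// In the runner: keep handles to every recurring timer so the process
// can stop scheduling background work once it has been obsoleted.
const timers = [];

async function runFiveMinuteTask() { /* stand-in for the real periodic job */ }

timers.push(setInterval(runFiveMinuteTask, 5 * 60 * 1000));

process.on('SIGTERM', () => {
  // Stop all background scheduling; in-flight operations keep running
  // and the process exits once they have drained.
  timers.forEach(clearInterval);
});
```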

    Possible solution: You could try sending/catching a signal (like SIGTERM). When signaled, the server stops accepting client connections, but continues uploading and terminates when all uploads are finished. While that is happening, start up a new nodejs process using the new code, and have that accept connections. This assumes that two processes can use your data store at the same time, which most databases support. – Colonel Thirty Two Mar 01 '16 at 23:43
  • This is actually a very interesting solution. But... how can I be the only one even _asking_ this question? I really don't get it – Merc Mar 02 '16 at 00:11
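A minimal sketch of the drain-and-replace approach from the first comment, assuming pending uploads are tracked in a counter; handleRequest and the counter wiring are illustrative:

```js
const http = require('http');

let pendingUploads = 0; // increment when an upload starts, decrement when it completes

function handleRequest(req, res) {
  // ...the real app logic; long-running uploads bump pendingUploads
  // while they are in flight.
  res.end('ok');
}

const server = http.createServer(handleRequest);
server.listen(3000);

process.on('SIGTERM', () => {
  // close() stops accepting new connections but keeps existing sockets alive.
  server.close(() => {
    const wait = setInterval(() => {
      if (pendingUploads === 0) { // all long-running work has drained
        clearInterval(wait);
        process.exit(0);
      }
    }, 1000);
  });
});
```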

1 Answer


As far as you're concerned, the only things you need to shut down gracefully are the pending uploads. The OS will handle everything else automatically; there is no point in "manually" cleaning up all of Node's internals, as that would probably be more complicated than the app itself ;)

Just start at least two "worker" processes on two different ports, and in the main app implement a simple panel where you can start/pause them and send all "tasks" to one of them. When you deploy, pause one worker, wait until its uploads are finished, deploy to it, and then do the same with the second one. An additional benefit is redundancy: if you implement a simple "ping" command, you can route connections away automatically if one of the processes dies.
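A rough sketch of the front process, assuming the two workers listen on ports 3001 and 3002; the ports and routing logic are illustrative:

```js
// Front process: forwards every request to whichever worker is currently
// "active". Flipping activePort is what the start/pause panel would do.
const http = require('http');

let activePort = 3001; // the two workers are assumed to listen on 3001 and 3002

http.createServer((req, res) => {
  const upstream = http.request(
    { port: activePort, path: req.url, method: req.method, headers: req.headers },
    (workerRes) => {
      res.writeHead(workerRes.statusCode, workerRes.headers);
      workerRes.pipe(res);
    }
  );
  upstream.on('error', () => {
    // If the active worker is down, a real router would fail over to the
    // other port here (the "ping" idea above).
    res.statusCode = 502;
    res.end();
  });
  req.pipe(upstream);
}).listen(3000);
```

A deploy then becomes: flip activePort to the other worker, wait for the paused worker's uploads to finish, redeploy it, and repeat.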

You can implement a function that returns the list of pending uploads together with any running timers, and the "main" app can then kill the runner automatically once that list is empty. If the operation a timer triggers is not atomic, add it to the list when it starts and remove it when it ends; the timer still ticking in the meantime is not a problem. Put it on the list 20 seconds before it fires and you won't have a race condition between fetching the "pending" list, killing the process, and the event firing.
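A sketch of such a registry; the task name and runFiveMinuteTask are illustrative:

```js
// Registry of operations that must not be interrupted.
const pending = new Set();

async function runFiveMinuteTask() { /* stand-in for the real periodic job */ }

// Recurring task: register it 20 seconds BEFORE it actually runs, so that
// "read the list, then kill the runner" cannot race with the timer firing.
setInterval(() => {
  pending.add('five-minute-task');
  setTimeout(async () => {
    try {
      await runFiveMinuteTask();
    } finally {
      pending.delete('five-minute-task'); // remove only once truly finished
    }
  }, 20 * 1000);
}, 5 * 60 * 1000);

// The main app polls this; when it returns an empty list (and no uploads
// are pending) the runner is safe to kill.
function pendingList() {
  return [...pending];
}
```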

– Slawek