fork a long-running process and avoid having to call waitpid to clean up the zombie?

Question

I have a long-running process (node.js) which calls fork (as part of a C++ module). This creates the new process as a child of the node.js process. However, there is nothing that will wait/waitpid for this child process, so it remains a zombie after it's terminated.

Is it possible to fork() a process without the current process being its parent, so that upon termination, it does not remain in the zombie state but is cleaned up?

If not, can I somehow indicate that I will not call waitpid on the child and don't care about when it terminates?

Failing all that, I can write/find a native module that can do the waitpid, but I need to be certain it will:

- Not block the parent process (node.js)
- Not leave any zombies after the module's function is called

Thanks!

Just `fork` twice. The long-running process will now have its parent (the first child) die and will be inherited by `init`, [which will reap it](http://stackoverflow.com/a/881434/721269). — David Schwartz, Feb 20 '13 at 02:29
Or on some Unixes you can set `SIGCHLD` to `SIG_IGN` and the kernel will interpret that as meaning you want the zombie to go away immediately and you're not interested in its exit status. — zwol, Feb 20 '13 at 02:31
@David: So the first fork will create child C1 (whose parent is the node.js process), and then fork C1 to create C2 which is the process I want to be inherited by init? C1 will die almost instantly, node.js then needs to waitpid on C1 (which we know will be quick), and C2 is then inherited by init. Did I understand correctly? — Pavel, Feb 20 '13 at 02:35
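
The sequence described in the comment above (fork C1, fork C2 from C1, let C1 exit, reap C1) can be sketched as below. This is a minimal illustration rather than production code; `spawn_detached` and `example_work` are hypothetical names introduced for the example.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Placeholder for the long-running job (assumption for the example). */
void example_work(void)
{
}

/* Hypothetical helper: runs `work` in a grandchild (C2) that the caller
 * never has to reap. Returns 0 on success, -1 on failure. */
int spawn_detached(void (*work)(void))
{
  int   status;
  pid_t c1 = fork();

  if (c1 == (pid_t)-1)
    return -1;

  if (c1 == 0)
  {
    /* C1: create the real worker, then exit straight away. */
    pid_t c2 = fork();
    if (c2 == (pid_t)-1)
      _exit(1);
    if (c2 == 0)
    {
      work();      /* C2: orphaned once C1 exits, so init reaps it */
      _exit(0);
    }
    _exit(0);
  }

  /* Original process: reap C1, which exits almost immediately. */
  while (waitpid(c1, &status, 0) == (pid_t)-1)
  {
    if (errno != EINTR)
      return -1;
  }
  return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The one `waitpid` call here blocks only for as long as C1 takes to fork and exit. The trade-off is that the caller never learns C2's pid or exit status unless C2 reports them some other way.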
@Zack: There's a portable way to do that: `sigaction` with the `SA_NOCLDWAIT` flag. — R.. GitHub STOP HELPING ICE, Feb 20 '13 at 04:47
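
A minimal sketch of the `SA_NOCLDWAIT` approach mentioned above, assuming a system that implements the XSI option; `disable_zombies` is a hypothetical name.

```c
#define _XOPEN_SOURCE 700   /* SA_NOCLDWAIT is part of the XSI option */
#include <signal.h>
#include <string.h>

/* Hypothetical helper: asks the kernel not to turn terminated children
 * into zombies. Returns 0 on success, -1 on failure (errno set). */
int disable_zombies(void)
{
  struct sigaction sa;

  memset(&sa, 0, sizeof sa);
  sa.sa_handler = SIG_DFL;        /* keep the default disposition */
  sa.sa_flags   = SA_NOCLDWAIT;   /* children are discarded on exit */
  sigemptyset(&sa.sa_mask);

  return sigaction(SIGCHLD, &sa, NULL);
}
```

Note that the setting is process-wide: once it is in effect, exit statuses are discarded for every child, so any other code in the same process that waits for its children is affected too.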

score 4 · Accepted Answer · answered Feb 20 '13 at 08:32

Here's the code I use to create a daemon. The comments describe why each step is done.

Status daemon_init(void)
{
  pid_t pid;
  int fh;

  /*-----------------------------------------------------------------------
  ; From the Unix Programming FAQ (corroborated by Stevens):
  ;
  ; 1. 'fork()' so the parent can exit, this returns control to the command
  ; line or shell invoking your program. This step is required so that
  ; the new process is guaranteed not to be a process group leader. The
  ; next step, 'setsid()', fails if you're a process group leader.
  ;---------------------------------------------------------------------*/

  pid = fork();
  if (pid == (pid_t)-1)
    return retstatus(false,errno,"fork()");
  else if (pid != 0)    /* parent goes bye bye */
    _exit(EXIT_SUCCESS);

  /*-------------------------------------------------------------------------
  ; 2. 'setsid()' to become a process group and session group leader. Since
  ; a controlling terminal is associated with a session, and this new
  ; session has not yet acquired a controlling terminal our process now
  ; has no controlling terminal, which is a Good Thing for daemons.
  ;
  ; _Advanced Programming in the Unix Environment_, 2nd Edition, also
  ; ignores SIGHUP. So adding that here as well.
  ;-----------------------------------------------------------------------*/

  if (setsid() == (pid_t)-1)
    return retstatus(false,errno,"setsid()");
  set_signal_handler(SIGHUP,SIG_IGN);   /* ignore this signal for now */

  /*-------------------------------------------------------------------------
  ; 3. 'fork()' again so the parent, (the session group leader), can exit.
  ; This means that we, as a non-session group leader, can never regain a
  ; controlling terminal.
  ;------------------------------------------------------------------------*/

  pid = fork();
  if (pid == (pid_t)-1)
    return retstatus(false,errno,"fork(2)");
  else if (pid != 0)    /* parent goes bye bye */
    _exit(EXIT_SUCCESS);

  /*-------------------------------------------------------------------------
  ; 4. 'chdir("/")' to ensure that our process doesn't keep any directory in
  ; use. Failure to do this could make it so that an administrator
  ; couldn't unmount a filesystem, because it was our current directory.
  ;
  ; [Equivalently, we could change to any directory containing files
  ; important to the daemon's operation.]
  ;
  ; I just made sure the name of the script we are using contains the full
  ; path.
  ;-------------------------------------------------------------------------*/

  if (chdir("/") == -1)
    return retstatus(false,errno,"chdir()");

  /*-----------------------------------------------------------------------
  ; 5. 'umask(022)' so that we have complete control over the permissions of
  ; anything we write. We don't know what umask we may have inherited.
  ;-----------------------------------------------------------------------*/

  umask(022);

  /*-----------------------------------------------------------------------
  ; 6. 'close()' fds 0, 1, and 2. This releases the standard in, out, and 
  ; error we inherited from our parent process. We have no way of knowing
  ; where these fds might have been redirected to. Note that many daemons
  ; use 'sysconf()' to determine the limit '_SC_OPEN_MAX'.
  ; '_SC_OPEN_MAX' tells you the maximum open files/process. Then in a
  ; loop, the daemon can close all possible file descriptors. You have to
  ; decide if you need to do this or not. If you think that there might
  ; be file-descriptors open you should close them, since there's a limit
  ; on the number of concurrent file descriptors.
  ;
  ; 7. Establish new open descriptors for stdin, stdout and stderr. Even if
  ; you don't plan to use them, it is still a good idea to have them
  ; open. The precise handling of these is a matter of taste; if you
  ; have a logfile, for example, you might wish to open it as stdout or
  ; stderr, and open '/dev/null' as stdin; alternatively, you could open
  ; '/dev/console' as stderr and/or stdout, and '/dev/null' as stdin, or
  ; any other combination that makes sense for your particular daemon.
  ;
  ; We do this here via dup2(), which combines steps 6 & 7.
  ;------------------------------------------------------------------------*/

  fh = open(DEV_NULL,O_RDWR);
  if (fh == -1)
    return retstatus(false,errno,"open(" DEV_NULL ")");

  assert(fh > 2);

  dup2(fh,STDIN_FILENO);
  dup2(fh,STDOUT_FILENO);
  dup2(fh,STDERR_FILENO);

  close(fh);

  return c_okay;
}

If you want to see this function in context, you can view it here: https://github.com/spc476/syslogintr/blob/master/syslogintr.c

Steps 6 and 7 should occur between steps 1 and 2; furthermore, you ought to issue `closefrom(3)` or equivalent at that point. On some (older) systems, `setsid` fails if you have an open fd referring to the terminal. — zwol, Feb 20 '13 at 14:17
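
On systems without `closefrom(3)`, an equivalent could be the `sysconf(_SC_OPEN_MAX)` loop that the code comments in the answer mention. The following is a rough sketch; `close_fds_from` is a hypothetical name, and the 65536 cap is an arbitrary assumption added so the loop stays short where the reported limit is very large or indeterminate.

```c
#include <unistd.h>

/* Hypothetical stand-in for closefrom(): closes every descriptor
 * >= lowfd, up to the per-process limit. Returns how many were open. */
int close_fds_from(int lowfd)
{
  long limit  = sysconf(_SC_OPEN_MAX);
  long fd;
  int  closed = 0;

  if (limit < 0 || limit > 65536)
    limit = 65536;              /* assumed cap, not a standard value */

  for (fd = lowfd; fd < limit; fd++)
  {
    if (close((int)fd) == 0)    /* EBADF on unused slots is expected */
      closed++;
  }
  return closed;
}
```

In a daemon this would typically be called as `close_fds_from(3)` once descriptors 0, 1 and 2 have been dealt with. Descriptors above the assumed cap are left open.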
