I have a SIGCHLD handler installed in my shell. Let's say I'm using waitpid() to wait for a foreground process to terminate so that I can reap it.

From what I know, waitpid suspends the current process and waits for the given process to terminate, yes?

What if one of the background processes terminated while my shell is waiting for the foreground process? When will my shell reap the terminated background process? How can I make sure that the background process will be reaped immediately?

Fadhil Abubaker

1 Answer

Simply make your parent shell wait in a loop for any child that dies, and each time waitpid() returns, decide what to do based on which child died.

#include <sys/types.h>
#include <sys/wait.h>

int status;
pid_t corpse;

while ((corpse = waitpid(-1, &status, 0)) != -1)
{
    /* …perform post-mortem actions… */
}

  • If the foreground process died (completed), you'll proceed to prompt for the next command.
  • If a background process died, you'll note that it is gone, record its status perhaps, and go back for another cycle of the loop.

When the foreground process exits, you'll report on any children that died before issuing the next prompt.

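The loop should never end because waitpid() returned -1: you leave it once the foreground process has been reaped, and until then there is always at least one child left to wait for.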
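
To make the loop concrete, here is a minimal sketch of how the post-mortem step might tell the foreground child apart from background ones. The names wait_for_foreground() and demo_foreground_wait() are hypothetical and exist only for illustration, and the retry on EINTR assumes a signal handler may be installed; treat it as one possible shape rather than the definitive implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sleep in waitpid() until fg_pid has been reaped,
 * reaping any background children that die in the meantime.
 * Returns the number of background children reaped, or -1 if waitpid()
 * failed (for example with ECHILD) before fg_pid was seen. */
int wait_for_foreground(pid_t fg_pid, int *fg_status)
{
    int status;
    int reaped_bg = 0;
    pid_t corpse;

    assert(fg_pid > 0);
    for (;;) {
        corpse = waitpid(-1, &status, 0);
        if (corpse == -1) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: wait again */
            return -1;
        }
        if (corpse == fg_pid) {
            *fg_status = status;    /* foreground done: back to the prompt */
            return reaped_bg;
        }
        /* A background child died: note it and go round the loop again. */
        fprintf(stderr, "[bg] pid %ld finished\n", (long)corpse);
        reaped_bg++;
    }
}

/* Hypothetical demo: one short-lived background child, one foreground
 * child that runs for about a second. Returns the foreground exit code,
 * or -1 if anything unexpected happened. */
int demo_foreground_wait(void)
{
    int status = 0;
    pid_t bg, fg;

    bg = fork();
    if (bg < 0)
        return -1;
    if (bg == 0)
        _exit(3);                   /* background child exits at once */

    fg = fork();
    if (fg < 0)
        return -1;
    if (fg == 0) {
        sleep(1);                   /* foreground child outlives it */
        _exit(7);
    }

    if (wait_for_foreground(fg, &status) != 1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The shell stays asleep inside waitpid() the whole time; it wakes only when some child dies, which is why no WNOHANG is needed here.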

Jonathan Leffler
  • If you're waiting in a loop, you may want to use `WNOHANG`. – EOF Oct 31 '15 at 15:24
  • @EOF: I don't want my shell being busy while the foreground process is running. At this stage, I know I'm waiting for the foreground process to complete. Any background processes that die need to be fielded too, but they're coincidental. I want the shell to sleep until a process dies; I do not want WNOHANG. It would be different if I was checking _whether_ any process has died; then I'd use WNOHANG. Here, the context is explicitly waiting for a process to die and WNOHANG is not correct. And if checking 'whether', it would be feasible to exit the loop with -1 (there aren't any children left). – Jonathan Leffler Oct 31 '15 at 15:28
  • If a BG process terminated while the parent shell is running an FG process, I can't wait for the FG process to terminate; I need to reap the BG process immediately. Is this possible? – Fadhil Abubaker Oct 31 '15 at 15:29
  • @JonathanLeffler: Well, you'll *eventually* need a `WNOHANG` wait, for any background-process that terminated after the foreground-process you just waited for. – EOF Oct 31 '15 at 15:31
  • @ReiJinThunderKeg: The whole answer says "Yes" and shows how to do it. – Jonathan Leffler Oct 31 '15 at 15:32
  • Pardon me if I'm wrong, but if I use waitpid in my parent shell, won't it suspend the SIGCHLD handler? Will I still catch SIGCHLD signals this way? – Fadhil Abubaker Oct 31 '15 at 15:38
  • `waitpid()` is an interruptible system call; SIGCHLD is a signal. OTOH, if you're waiting for children to die in a loop like this, SIGCHLD is not all that relevant; maybe you want to suspend the signal handler for the time being? SIGCHLD is a funny signal; you don't want to SIG_IGN it because then `waitpid()` and its relatives won't get told about children dying. You'd reinstate SIG_DFL for the time being, then reset the signal handler. And worry about using `waitpid()` with WNOHANG to pick up any corpses that died while you weren't watching with the signal handler. – Jonathan Leffler Oct 31 '15 at 15:46
  • So say a BG job terminates and my shell catches the signal. But isn't there a race condition here between my parent shell and the child FG process? For instance, my parent shell is about to execute the handler, but the OS suddenly context switches and now the FG process is running. Won't this be a problem? – Fadhil Abubaker Oct 31 '15 at 15:51
  • Or, you can decide to use the signal handler to deal with things, but you'll need to be careful. Signal handlers should usually be minimal (see [How to avoid using `printf()` in a signal handler](http://stackoverflow.com/questions/16891019/) for more information). And if you have signal handling in place, you might need to refine the loop shown to `while ((corpse = waitpid(-1, &status, 0)) != -1 || errno == EINTR)` and modify the body of the loop to only do post mortem analysis when `corpse != -1`. You'll also need to check whether the signal was from the foreground child. – Jonathan Leffler Oct 31 '15 at 15:51
  • Which race condition? – Jonathan Leffler Oct 31 '15 at 15:52
  • My parent shell is about to execute the handler upon receiving the signal, but the OS suddenly context switches and now the FG process is running. Won't this be a problem? – Fadhil Abubaker Oct 31 '15 at 15:58
  • No; that's what you want to happen. You want your shell to be mostly asleep while the foreground process is running. – Jonathan Leffler Oct 31 '15 at 16:31