13

In my program I fork child processes (which run in parallel) inside a finite while loop and call exec in each of them. I want the parent process to resume execution at the point after this while loop only after all children have terminated. How should I do that?

I have tried several approaches. In one, I made the parent pause after the while loop and set a flag from the SIGCHLD handler only when waitpid returned the error ECHILD (no children remaining). The problem with this approach is that retStat becomes -1 even before the parent has finished forking all the processes.

    void sigchld_handler(int signo) {
        pid_t pid;
        while((pid= waitpid(-1,NULL,WNOHANG)) > 0);
        if(errno == ECHILD) {
            retStat = -1;
        }
    }

    // parent process code
    retStat = 1;
    while (some condition) {
        // fork() a child, which then calls exec()
    }

    while (retStat > 0)
        pause();
    // This is the point where execution should resume, only after all children have finished
Étienne
avd

2 Answers

22

Instead of calling waitpid in the signal handler, why not create a loop after you have forked all the processes as follows:

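    pid_t pid;
    while ((pid = waitpid(-1, NULL, 0)) != 0) {
        /* errno is only meaningful when waitpid() has just failed */
        if (pid == -1 && errno == ECHILD) {
            break;  /* no children left */
        }
    }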

The program should hang in the loop until there are no more children. Then it will fall out and the program will continue. As an additional bonus, the loop will block on waitpid while children are running, so you don't need a busy loop while you wait.

You could also use wait(NULL) which should be equivalent to waitpid(-1, NULL, 0). If there's nothing else you need to do in SIGCHLD, you can set it to SIG_DFL.
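The fork loop followed by the blocking wait loop could be sketched as below. This is only a minimal illustration, not the asker's program: `spawn_children` and `wait_for_all_children` are hypothetical helper names, and `sleep 1` is an assumed stand-in for whatever each child really execs.

```c
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork n children that each exec "sleep 1"
 * (a placeholder command). Returns how many children were started. */
static int spawn_children(int n)
{
    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == -1) {
            perror("fork");
            break;
        }
        if (pid == 0) {
            execlp("sleep", "sleep", "1", (char *)NULL);
            _exit(127);  /* only reached if exec failed */
        }
        started++;
    }
    return started;
}

/* Hypothetical helper: block until every child has terminated.
 * Returns the number of children reaped. */
static int wait_for_all_children(void)
{
    int reaped = 0;
    for (;;) {
        pid_t pid = waitpid(-1, NULL, 0);
        if (pid > 0) {
            reaped++;             /* one child finished */
        } else if (errno == ECHILD) {
            break;                /* no children left */
        } else if (errno != EINTR) {
            perror("waitpid");    /* unexpected error: give up */
            break;
        }
        /* on EINTR, simply retry */
    }
    return reaped;
}
```

Calling `spawn_children(3)` and then `wait_for_all_children()` from `main` should return only after all three children have exited, with no SIGCHLD handler installed.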

Timothy Baldwin
Jeremy Bourque
  • I need to give WNOHANG option because if there are many children who deliver SIGCHLD signal almost at the same time, and since signals are not queued in UNIX, if I use 0, then I will be able to catch only one signal then zombies will be left around. – avd Oct 02 '09 at 17:55
  • Just to add a bit, the -1 will check against any child pid (could use WAIT_ANY for clarity also). – amischiefr Oct 02 '09 at 17:57
  • @aditya: The zombie processes are just waiting for you to call wait() (or waitpid()) on them. As soon as you do, they'll disappear, whether or not you've caught the signal. So the wait() loop after the fork() loop will mop up all the zombies. – Jeremy Bourque Oct 02 '09 at 18:05
  • Agreed -- if you are using `wait/waitpid` then you don't need to handle the `SIGCHLD` yourself. – mob Oct 02 '09 at 21:50
0

I think you should use the waitpid() call. It allows you to wait for "any child process", so if you do that the proper number of times, you should be golden.
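As a rough sketch of "waiting the proper number of times": if the parent knows it started n children, it can call wait() exactly n times and inspect each exit status along the way. The names `spawn_exiting_child` and `count_clean_exits` are made up for this illustration, and the child here just exits with a given code instead of calling exec.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that immediately exits with `code`.
 * Returns the child's PID in the parent, or -1 if fork failed. */
static pid_t spawn_exiting_child(int code)
{
    pid_t pid = fork();
    if (pid == 0) {
        _exit(code);  /* child: a real program would exec here */
    }
    return pid;
}

/* Hypothetical helper: wait for exactly n children and return how many
 * of them exited normally with status 0. */
static int count_clean_exits(int n)
{
    int clean = 0;
    for (int i = 0; i < n; i++) {
        int status;
        pid_t pid;
        do {
            pid = wait(&status);           /* blocks until any child ends */
        } while (pid == -1 && errno == EINTR);
        if (pid == -1) {
            break;                         /* e.g. ECHILD: fewer children than expected */
        }
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0) {
            clean++;
        }
    }
    return clean;
}
```

Counting has the drawback that the parent must track n correctly; looping until ECHILD avoids that bookkeeping.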

If that fails (I'm not sure about the guarantees), you could take the brute-force approach: sit in a loop, call waitpid() with the WNOHANG option on each of your child PIDs, and then delay for a while before doing it again.
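The polling fallback might look roughly like this. It is a sketch under assumptions: `spawn_sleeper` and `poll_children` are hypothetical names, the 50 ms delay is arbitrary, and the children only sleep briefly instead of calling exec.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that sleeps for `ms` milliseconds
 * (ms < 1000) and then exits. Returns the child's PID, or -1 on failure. */
static pid_t spawn_sleeper(long ms)
{
    pid_t pid = fork();
    if (pid == 0) {
        struct timespec nap = { 0, ms * 1000000L };
        nanosleep(&nap, NULL);
        _exit(0);
    }
    return pid;
}

/* Hypothetical helper: poll every PID in pids[0..n-1] with WNOHANG until
 * none is left. Finished entries are overwritten with -1.
 * Returns the number of children actually reaped. */
static int poll_children(pid_t *pids, int n)
{
    const struct timespec delay = { 0, 50 * 1000000L };  /* 50 ms, arbitrary */
    int remaining = 0, reaped = 0;

    for (int i = 0; i < n; i++) {
        if (pids[i] > 0) remaining++;
    }
    while (remaining > 0) {
        for (int i = 0; i < n; i++) {
            if (pids[i] <= 0) continue;
            pid_t r = waitpid(pids[i], NULL, WNOHANG);
            if (r == pids[i]) {                      /* child finished */
                pids[i] = -1; remaining--; reaped++;
            } else if (r == -1 && errno != EINTR) {  /* e.g. ECHILD: not our child */
                pids[i] = -1; remaining--;
            }
            /* r == 0 means the child is still running */
        }
        if (remaining > 0) nanosleep(&delay, NULL);
    }
    return reaped;
}
```

Compared with a blocking waitpid(-1, NULL, 0) loop, this wakes up periodically even when nothing has changed, so it is mainly useful when the parent has other work to do between polls.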

unwind
  • Please see the code; I have done exactly what you are suggesting. – avd Oct 02 '09 at 17:49
  • @aditya: Uh, no, my suggestion means to do as jborque suggests, I didn't say anything about using a signal handler. – unwind Oct 02 '09 at 18:12