
I tried to use the answer to "Will wait and waitpid block SIGCHLD and unblock it when they return in Linux?" to answer my question below. That answer led me to the understanding I describe here. Why is it mistaken?


Given the following:

  • The Handler

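    #include <assert.h>
    #include <errno.h>
    #include <signal.h>
    #include <sys/wait.h>

    void sig_child(int sig)
    {
        /* Re-install the handler in case signal() reset the disposition. */
        signal(SIGCHLD, sig_child);

        if (sig == SIGCHLD)
        {
            int status;
            pid_t num;

            /* Reap every child that has already terminated, without blocking. */
            while ((num = waitpid(WAIT_ANY, &status, WNOHANG)) > 0)
            {
            }
            if (num == -1 && errno != ECHILD)
            {
                // We shouldn't reach here
                assert(0);
            }
        }
    }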

  • The main blurb

    if (pidnum)
    {
        // Some work
        int status;
        pid_t num = waitpid(pidnum, &status, 0);
        // Some work
    }

I think that the following should take place:

1) The main code (in the second blurb) calls waitpid

2) SIGCHLD is raised, so control transfers to the signal handler

3) The signal handler removes the terminated child from the process table, obtaining its status information

4) When control transfers back to the main code (the second blurb), since the process's child has already been removed from the process table, num should be set to -1


In fact, the following takes place (I'd used some print statements that are now removed to trace the flow), contradicting my understanding above:

1) The main blurb reaches waitpid

2) SIGCHLD is raised, so control transfers to the handler

3) In the handler, num is set to -1 and errno to ECHILD

4) Control returns to the main blurb, where num is given the pid of the child that had been forked before.

Muno
  • Can you explain what you're ultimately trying to achieve? Why are you calling `waitpid` twice? – Shachar Shemesh Oct 02 '17 at 10:26
  • I want to program a mini shell where I can both set up background processes and foreground processes. I call waitpid in the main blurb -- without WNOHANG -- because I want my shell to wait until its child process finishes. There is another section of the main blurb (not shown) in which I spawn background processes. These are handled by the signal handler. I handle these background processes by waiting with WNOHANG, so that if multiple background processes happen to terminate around the same time, I can reap them all. – Muno Oct 02 '17 at 12:58

1 Answer


You are making two waitpid calls with overlapping criteria, and complaining that one of them catches your process and the other doesn't, rather than the other way around. I don't think there is any way to resolve this particular case as written.

The way I see it, you have two paths to a solution. One is to use non-overlapping waitpid criteria. The problem is that it isn't easy to see how you can create such a case.

The more reasonable solution is to have just one waitpid in the main loop. Have that wait examine the pid that was reported, and handle it in two different ways. If it is the child of the main program, do one thing. If it is one of the background tasks you are running, do another (or, in your case, nothing).
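As a minimal sketch of that single-waitpid approach, assuming no SIGCHLD handler is installed that also calls waitpid: the helper `wait_for_foreground` and the demo function `demo_foreground_exit_code` are hypothetical names made up for illustration, and the demo children simply call `_exit` where a real shell would exec a command.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: wait for ANY child, but return only once the
 * foreground child fg_pid has been reaped. Background children that
 * terminate in the meantime are reaped along the way.
 * Returns 0 and stores the foreground status in *fg_status, or -1 on error. */
int wait_for_foreground(pid_t fg_pid, int *fg_status)
{
    assert(fg_pid > 0 && fg_status != NULL);

    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, 0);

        if (pid == -1) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;             /* e.g. ECHILD: no children left */
        }
        if (pid == fg_pid) {
            *fg_status = status;   /* the child the shell was waiting for */
            return 0;
        }
        /* Otherwise a background child was reaped; nothing more to do. */
    }
}

/* Hypothetical demo: one background child (exit code 3) and one
 * foreground child (exit code 7). Returns the foreground exit code. */
int demo_foreground_exit_code(void)
{
    int status;
    pid_t bg_pid, fg_pid;

    bg_pid = fork();
    if (bg_pid == -1)
        return -1;
    if (bg_pid == 0)
        _exit(3);                  /* background job */

    fg_pid = fork();
    if (fg_pid == -1)
        return -1;
    if (fg_pid == 0)
        _exit(7);                  /* foreground job */

    if (wait_for_foreground(fg_pid, &status) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because there is only one place that reaps children, the foreground wait and the background cleanup can no longer race each other. The trade-off is that background children are only reaped while the shell is inside this loop.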

If, for whatever reason, you cannot handle the background processes' exits from within the main loop, just block SIGCHLD while you're doing the master (foreground) wait.
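One way to block SIGCHLD around the foreground wait is `sigprocmask`. The following is only a sketch under stated assumptions: `reap_background`, `install_sigchld_handler`, and `run_foreground` are hypothetical names, the handler is installed with `sigaction` rather than `signal`, and the foreground child just calls `_exit` instead of exec'ing a command. Note that the signal is blocked before the fork, so the handler cannot reap the foreground child before the main code waits for it.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Handler for background children: reap everything that has exited. */
static void reap_background(int sig)
{
    int saved_errno = errno;       /* waitpid may clobber errno */

    (void)sig;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        ;
    errno = saved_errno;
}

/* Hypothetical helper: install the handler above for SIGCHLD. */
int install_sigchld_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = reap_background;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Hypothetical helper: run a foreground child that exits with exit_code
 * and return its exit status, or -1 on error. SIGCHLD stays blocked from
 * before the fork until after the foreground waitpid has returned. */
int run_foreground(int exit_code)
{
    sigset_t block, old;
    int status = 0, result = -1;
    pid_t fg_pid, pid;

    assert(exit_code >= 0 && exit_code <= 255);

    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &old) == -1)
        return -1;

    fg_pid = fork();
    if (fg_pid == 0) {
        /* Child: restore the mask; a real shell would exec here. */
        sigprocmask(SIG_SETMASK, &old, NULL);
        _exit(exit_code);
    }
    if (fg_pid > 0) {
        do {
            pid = waitpid(fg_pid, &status, 0);
        } while (pid == -1 && errno == EINTR);
        if (pid == fg_pid && WIFEXITED(status))
            result = WEXITSTATUS(status);
    }

    /* Unblock: a pending SIGCHLD is delivered now, and the handler reaps
     * any background children that exited during the foreground wait. */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return result;
}
```

With this arrangement the handler still runs once the mask is restored, but by then the foreground child has already been collected by the main code, so the handler only ever sees background children (or nothing at all).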

In any case, always bear in mind that SIGCHLD is a non-queuing signal: several children exiting close together may result in a single delivery. This means that you can never rely on a 1:1 relationship between the number of times your signal handler gets called and the number of waits you need to run.
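The non-queuing behaviour can be observed with a small experiment. This is a sketch, not a definitive test: `count_and_reap` and `demo_coalesced_sigchld` are hypothetical names, and the demo uses `waitid` with `WNOWAIT` only to make sure each child has exited (without reaping it) while SIGCHLD is still blocked.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t handler_calls = 0;
static volatile sig_atomic_t children_reaped = 0;

/* Count handler invocations and reap every exited child in a loop. */
static void count_and_reap(int sig)
{
    int saved_errno = errno;

    (void)sig;
    handler_calls++;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        children_reaped++;
    errno = saved_errno;
}

/* Hypothetical demo: let nchildren exit while SIGCHLD is blocked, then
 * unblock it. Returns the number of children reaped by the handler and
 * stores the number of handler invocations in *calls; -1 on error. */
int demo_coalesced_sigchld(int nchildren, int *calls)
{
    struct sigaction sa;
    sigset_t block, old;
    int i, failed = 0;

    assert(nchildren > 0 && calls != NULL);
    handler_calls = 0;
    children_reaped = 0;

    sa.sa_handler = count_and_reap;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &old) == -1)
        return -1;

    for (i = 0; i < nchildren && !failed; i++) {
        siginfo_t info;
        pid_t pid = fork();

        if (pid == 0)
            _exit(0);
        /* WNOWAIT: block until the child has exited, but leave it
         * unreaped so that the handler can collect it later. */
        if (pid == -1 ||
            waitid(P_PID, (id_t)pid, &info, WEXITED | WNOWAIT) == -1)
            failed = 1;
    }

    /* All exits happened while SIGCHLD was blocked, so at most one
     * SIGCHLD is pending; it is delivered when the mask is restored. */
    sigprocmask(SIG_SETMASK, &old, NULL);

    *calls = handler_calls;
    return failed ? -1 : children_reaped;
}
```

Calling `demo_coalesced_sigchld(3, &calls)` should reap all three children, while `calls` will typically be smaller than three. This is why the handler loops over `waitpid` with `WNOHANG` instead of calling it once per signal.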

Shachar Shemesh