I'm using PCNTL to run a big PHP script across multiple processes on an Ubuntu server.
Here is the code (simplified and commented):

function signalHandler($signo = null) {
    $pid = posix_getpid();
    switch ($signo) {
        case SIGTERM:
        case SIGINT:
        case SIGKILL:
            // a process is asked to stop (from user or father)
            exit(3);
            break;
        case SIGCHLD:
        case SIGHUP:
            // ignore signals
            break;
        case 10: // signal user 1
            // a process finished its work
            exit(0);
            break;
        case 12: // signal user 2
            // a process got an error.
            exit(3);
            break;
        default:
            // nothing
    }
}

public static function run($nbProcess, $nbTasks, $taskFunc, $args) {
    $pid = 0;
    // there will be $nbTasks tasks to do, and no more than $nbProcess children must work at the same time
    $MAX_PROCESS = $nbProcess;
    $pidFather = posix_getpid();

    $data = array();
    $arrayPid = array(); // running children: child pid => task index

    pcntl_signal(SIGTERM, "signalHandler");
    pcntl_signal(SIGINT, "signalHandler");
//  pcntl_signal(SIGKILL, "signalHandler"); // SIGKILL can't be overloaded
    pcntl_signal(SIGCHLD, "signalHandler");
    pcntl_signal(SIGHUP, "signalHandler");
    pcntl_signal(10, "signalHandler"); // user signal 1
    pcntl_signal(12, "signalHandler"); // user signal 2

    for ($indexTask = 0; $indexTask < $nbTasks ; $indexTask++) {
        $pid = pcntl_fork();
        // Father and new child both read code from here

        if ($pid == -1) {
            // log error
            return false;
        } elseif ($pid > 0) {
            // We are in father process
            // storing child id in an array
            $arrayPid[$pid] = $indexTask;
        } else {
            // We are in child, nothing to do now
        }

        if ($pid == 0) {
            // We are in child process

            $pidChild = posix_getpid();

            try {
                //$taskFunc is an array containing an object, and the method to call from that object
                $ret = (array) call_user_func($taskFunc, $indexTask, $args);// similar to $ret = (array) $taskFunc($indexTask, $args);

                $returnArray = array(
                                    "tasknb" => $indexTask,
                                    "time" => $timer,
                                    "data" => $ret,
                );
            } catch(Exception $e) {
                // some stuff to exit child
            }

            $pdata = array();
            array_push($pdata, $returnArray);
            $data_str = serialize($pdata);

            $shm_id = shmop_open($pidChild, "c", 0644, strlen($data_str));
            if (!$shm_id) {
                // log error
            } else {
                if(shmop_write($shm_id, $data_str, 0) != strlen($data_str)) {
                    // log error
                }
            }
            // We are in a child and job is done. Let's exit !
            posix_kill($pidChild, 10); // sending user signal 1 (OK)
            pcntl_signal_dispatch();
        } else {
            // we are in father process,
            // we check number of running children
            while (count($arrayPid) >= $MAX_PROCESS) {
                // There are more children than allowed
                // waiting for any child to send signal
                $pid = pcntl_wait($status);
                // A child sent a signal !

                if ($pid == -1) {
                    // log error
                }

                if (pcntl_wifexited($status)) {
                    $statusChild = pcntl_wexitstatus($status);
                } else
                    $statusChild = $status;

                // father ($pidFather) saw a child ($pid) exiting with status $statusChild (or $status ?)
                //                                                                ^^^^          ^^^^^^
                //                                                                (=3)  (= random number ?)
                if(isset($arrayPid[$pid])) {
                    // father knows this child
                    unset($arrayPid[$pid]);
                    if ($statusChild == 0 || $statusChild == 10 || $statusChild == 255) {
                        // Child did not report any error
                        $shm_id = shmop_open($pid, "a", 0, 0);
                        if ($shm_id === false) {
                            // log error
                        } else {
                            $shm_data = unserialize(shmop_read($shm_id, 0, shmop_size($shm_id)));
                            shmop_delete($shm_id);
                            shmop_close($shm_id);
                            $data = array_merge($data, $shm_data);
                        }
                        // kill (again) child
                        posix_kill($pid, 10);
                        pcntl_signal_dispatch();
                    }
                    else {
                        // Child reported an error
                    }
                }
            }
        }
    }
}

The problem I'm facing concerns the value returned by pcntl_wexitstatus.
To keep it simple: there is a father process that must create 200 child processes.
It forks them one at a time, and waits for a child to finish whenever the limit of 8 running processes is reached.
I added many logs, so I can see that a child finished its work.
I see it reach the line posix_kill($pidChild, 10);.
I see the signal handler being called with user signal 1 (which results in an exit(0)).
I see the father wake up, but when it reads the exit code from pcntl_wexitstatus, it gets 3 and so concludes that the child failed, even though the child exited with code 0!
The pid is the correct child's pid.
Maybe I misunderstand how signals work... Any clue?
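
For reference, here is a minimal sketch of how the raw `$status` filled in by `pcntl_wait`/`pcntl_waitpid` can be decoded, assuming the `pcntl` and `posix` extensions are loaded. The helpers `decodeStatus` and `runChild` are hypothetical names used only for illustration; they are not part of the code above. The raw status is an encoded value, so it only becomes meaningful once it has gone through `pcntl_wifexited`/`pcntl_wexitstatus` or `pcntl_wifsignaled`/`pcntl_wtermsig`.

```php
<?php
// Hypothetical helper (not part of the original code): turns the raw status
// filled in by pcntl_wait()/pcntl_waitpid() into a readable result.
function decodeStatus(int $status): array
{
    if (pcntl_wifexited($status)) {
        // Normal termination: the child called exit() or ran to the end.
        return ['how' => 'exited', 'value' => pcntl_wexitstatus($status)];
    }
    if (pcntl_wifsignaled($status)) {
        // The child was terminated by a signal it did not handle.
        return ['how' => 'signaled', 'value' => pcntl_wtermsig($status)];
    }
    // Anything else (e.g. a stopped child) is returned undecoded.
    return ['how' => 'other', 'value' => $status];
}

// Hypothetical helper: forks, runs $childBody in the child,
// and returns the decoded status as seen by the father.
function runChild(callable $childBody): array
{
    $pid = pcntl_fork();
    if ($pid === -1) {
        throw new RuntimeException('fork failed');
    }
    if ($pid === 0) {
        $childBody();
        exit(0); // reached only if $childBody returns
    }
    pcntl_waitpid($pid, $status);
    return decodeStatus($status);
}

// One child exits cleanly, the other is terminated by SIGKILL.
$clean  = runChild(function () { exit(0); });
$killed = runChild(function () { posix_kill(posix_getpid(), SIGKILL); sleep(1); });

var_dump($clean, $killed);
```

In this sketch a child that calls `exit()` is reported through `pcntl_wexitstatus`, while a child terminated by an unhandled signal is reported through `pcntl_wtermsig`; the undecoded `$status` should not be compared with exit codes directly.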

Random
  • Why don't you install `pthreads` extension and just use threads? These are processes, not threads. Anyway, when your child **process** is finished, call `exit(0);` - that indicates successful program termination. Don't raise signals on your own unless an error happens. – N.B. May 22 '15 at 14:57
  • @N.B. Never heard about `pthreads` but anyway I have no choice, I have to use PCNTL (unless there is a **real** problem with PCNTL). I go checking the difference between thread and processes to see what is relevant... I'll also try to exit(0) directly and come back in few minutes, thank you ! :) – Random May 22 '15 at 15:06
  • [Here's the GitHub repo for the pthreads extension](https://github.com/krakjoe/pthreads). Onto your issue: parent process catches all signals that a child emits. When the child is done with the work, it should exit with status 0 which indicates success. All other exit codes indicate an error of a sort. If I remember correctly, status codes `0` and `8` are clean child process exit codes. Now, what I would like to know is what are you trying to achieve with this script and why is every child process writing to a separate block of shared memory? – N.B. May 22 '15 at 15:13
  • @N.B. each process writes and emails a PDF. We were previously doing this in a single process, but while an email is being sent the process just waits, which is a waste of time. We now use 8 processes (on a quad-core CPU) to solve this problem. The shared memory is used to collect data about minor errors that occurred in a child process (which we print once all children have finished their job). I didn't write this code, but I'm asked to use it and make it work for this task. So I read a lot of documentation about PCNTL (which I think I understand well), but didn't look into shmop. Is there something wrong? – Random May 22 '15 at 15:24
  • I'm just asking about what you're doing - so, you have some long running task and you're trying to do it asynchronously which is fine. However, I asked about shared memory because that is supposed to be shared yet every process acquires its own. There's nothing wrong with the approach, it's just that some things aren't as clear. Have you considered using a message queue and task distribution pattern perhaps? What you have there is a supervisor process with child worker processes. Using ZeroMQ and fan-out approach is much, much less code than what you have, if you're able to change it of course – N.B. May 22 '15 at 15:28
  • @N.B. This code is used in 3 different applications, so it has to keep the same pattern; I don't have much freedom there. If I change it, I'll have to change it in the other applications too (which is long and risky...), and justify why the change is required (and more efficient...). What I can do is 'adapt' the code to work in my application while keeping it similar to the other applications' versions... I also edited my question to use `process` instead of `thread` – Random May 22 '15 at 15:40
  • @N.B. In fact, a `shutdown function` had been registered. So an exit(0) called this shutdown function, which forced the exit code to be 3. Thanks for your help! – Random May 25 '15 at 09:21
  • Well, I must admit - very nicely spotted, this isn't exactly trivial thing to do so have my upvote, I'm sure this will help someone in the future :) – N.B. May 25 '15 at 11:02

1 Answer

I found the problem.
In my application, register_shutdown_function(myFrameworkShutdownFunction) was used to shut the script down "smoothly".
So exit(0) did not stop the child process immediately: PHP first ran myFrameworkShutdownFunction, which replaced exit code 0 with code 3 (because of a misconfigured variable).
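
The effect can be reproduced in isolation with a minimal sketch, assuming the `pcntl` extension is loaded. The body of `myFrameworkShutdownFunction` below is a hypothetical stand-in: the real framework function is not shown here, and the `$misconfigured` flag only imitates the misconfigured variable.

```php
<?php
// Hypothetical stand-in for the framework's shutdown function.
function myFrameworkShutdownFunction(): void
{
    $misconfigured = true; // assumption: imitates the misconfigured variable
    if ($misconfigured) {
        // Calling exit() inside a shutdown function replaces the status
        // that was passed to the earlier exit(0).
        exit(3);
    }
}

$pid = pcntl_fork();
if ($pid === -1) {
    throw new RuntimeException('fork failed');
}

if ($pid === 0) {
    // Child: register the shutdown function, then "succeed".
    register_shutdown_function('myFrameworkShutdownFunction');
    exit(0);
}

// Father: wait for the child and read the exit code it finally reported.
pcntl_waitpid($pid, $status);
$exitCode = pcntl_wifexited($status) ? pcntl_wexitstatus($status) : null;

var_dump($exitCode);
```

Under these assumptions the father should see the code set by the shutdown function rather than the 0 passed to `exit(0)`, which matches what the logs showed. When an exit code looks wrong, any function registered with `register_shutdown_function` is worth checking before suspecting the signal handling.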

Random