
Recently, while reading a book about Linux programming, I came across this statement:

The status argument given to _exit() defines the termination status of the process, which is available to the parent of this process when it calls wait(). Although defined as an int, only the bottom 8 bits of status are actually made available to the parent. Only 0 ~ 127 is recommended, because 128 ~ 255 can be confusing in shell scripts: the shell reports a command killed by signal N as status 128 + N. For the same reason -1 is a poor choice: only the low 8 bits of its two's complement representation are kept, so it becomes 255.

The above is about the exit status of a child process.

My questions are:

  • Why does the parent process only get 8 bits of the child process's exit status?
  • What about the return value of normal functions? Is it reasonable or beneficial to use only 0 ~ 127 there as well? I sometimes use -1 as a return value to indicate an error; should I stop doing that in future?

Update - status obtained via wait() / waitpid():

I read more chapters of the book (TLPI) and found that there are more details about the return status and wait()/waitpid() that are worth mentioning. I should have read those chapters before asking the question. Anyhow, I have added an answer of my own to describe them, in case it helps someone in future.

Eric
  • This is a feature/limitation of the operating system (receiving the return value from `main`) and doesn't apply to internal functions in the program. – Bo Persson Aug 31 '15 at 15:32
  • If these are the MSBs, then a value of 0..127 would yield `0` anyway. Sure you cited the text correctly? Looks strange to me, it should have been the least significant bits actually. – too honest for this site Aug 31 '15 at 15:32
  • @Olaf 127 will still be 127 I think, because it's '0111,1111'; it only takes 8 bits. – Eric Aug 31 '15 at 15:37
  • "Because I do use -1 as return value to indicate error sometimes" - While this is not uncommon, it's not awfully clean either. API functions often return `0` for failure and another value for success. This allows idioms like `if ( ! myFunction() ) { bailOutWithErrorMsg(); }`. – JimmyB Aug 31 '15 at 15:57
  • @HannoBinder But it seems a lot of system calls and glibc functions return 0 to indicate that the operation succeeded, as does `main()`, right? – Eric Aug 31 '15 at 16:01
  • Yes, sorry, I mixed them up :) `0` = OK, `-1` = error seems prevalent. – JimmyB Aug 31 '15 at 16:05
  • @HannoBinder: That seems to be from the [POSIX](http://pubs.opengroup.org/onlinepubs/009695399/functions/exit.html) spec (which makes sense). Any value other than `0` (false) signals an error. The negative values (-128..-1) are used by the system, the positive >0 are normally application-defined (with possibly further restrictions). That is actually a signed `char` in an `int` which might be extended to `int` by the environment. – too honest for this site Aug 31 '15 at 16:19
  • @EricWang: Do you actually understand what is meant by "MSBs"? If you only take the 8 MSBs of a 32 bit `int` (as demanded by POSIX) which contains 0..127, you will get `0`. As you cite it, it is definitively wrong, as I presumed. – too honest for this site Aug 31 '15 at 16:22
  • @Olaf The book means `LSB`, not `MSB`, maybe my initial post didn't make that clear. – Eric Aug 31 '15 at 16:27
  • Well "din't make that clear ..." quite a euphemism. You actually stated the contrary. Please think of it next time. – too honest for this site Aug 31 '15 at 16:31
  • You know, English is not my native language, so sometimes I make mistakes like that. But sure, nice tip, I will pay more attention to the wording in future. – Eric Aug 31 '15 at 16:36

2 Answers

Why does the parent process only get 8 bits of the child process's exit status?

Because POSIX says so. And POSIX says so because that's how the original Unix worked, and how many operating systems derived from it and modeled after it continue to work.

What about the return value of normal functions?

They are unrelated. Return whatever is reasonable. -1 is as good as any other value, and is in fact a standard way to indicate an error in a great many standard C and POSIX APIs.

n. m. could be an AI
  • That's what I hope it would be :) – Eric Aug 31 '15 at 15:44
  • POSIX [also states](https://pubs.opengroup.org/onlinepubs/9699919799/functions/_Exit.html): "the full value shall be available from `waitid()` and in the `siginfo_t` passed to a signal handler for `SIGCHLD`." So the full 32 bits **are** available on POSIX-compliant systems. – Andrew Henle Mar 05 '22 at 01:01

The answer from @n.m. is good.

But later on, I read more chapters of the book (TLPI) and found more details about the return status and wait()/waitpid() that are worth mentioning. They might be another important reason, or even the root reason, why a child process can't use all the bits of an int when it exits.

Wait status

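basically:

  • the child process should exit with a value that fits in 1 byte; that value is stored as part of the status argument filled in by wait() / waitpid(),
  • and only the 2 least significant bytes of status are used,

byte usage of status (this is the layout TLPI describes for Linux; other systems may differ):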

    event                   byte 1                  byte 0
    ============================================================
    * normal termination    exit status (0 ~ 255)   0
    * killed by signal      0                       termination signal (!=0)
    * stopped by signal     stop signal             0x7F
    * continued by signal   0xFFFF (both bytes)

dissect returned status:

the header <sys/wait.h> defines a set of macros that help to dissect a wait status,

macros:
* WIFEXITED(status)
    returns true if the child process exited normally,
* WEXITSTATUS(status)
    returns the exit status (0 ~ 255) of the child process,
    only meaningful when WIFEXITED(status) is true,

* WIFSIGNALED(status)
    returns true if the child process was killed by a signal,
* WTERMSIG(status)
    returns the number of the signal that terminated the process,
* WCOREDUMP(status)
    returns true if the child process produced a core dump file,
    tip:
        this macro is not in SUSv3 and might be absent on some systems,
        thus better check whether it exists first, via:
            #ifdef WCOREDUMP
                // ...
            #endif

* WIFSTOPPED(status)
    returns true if the child process was stopped by a signal,
* WSTOPSIG(status)
    returns the number of the signal that stopped the process,

* WIFCONTINUED(status)
    returns true if the child process was resumed by the signal SIGCONT,
    tip:
        this macro is part of SUSv3, but some old Linux versions and some Unix systems might not implement it,
        thus better check whether it exists first, via:
            #ifdef WIFCONTINUED
                // ...
            #endif

Sample code

wait_status_test.c

// dissect status returned by wait()/waitpid()
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/wait.h>

#define SLEEP_SEC 10 // seconds the child process sleeps before exiting

int wait_status_test() {
    pid_t cpid;

    // create child process,
    switch(cpid=fork()) {
        case -1: // failed
            perror("fork"); // report errno before anything can overwrite it
            exit(EXIT_FAILURE);
        case 0: // success, child process goes here
            sleep(SLEEP_SEC);
            printf("child [%d], going to exit\n",(int)getpid());
            fflush(stdout); // _exit() does not flush stdio buffers
            _exit(EXIT_SUCCESS);
        default: // success, parent process goes here
            printf("parent [%d], child created [%d]\n", (int)getpid(), (int)cpid);
            break;
    }

    // wait for state changes of the child
    int status;
    int wait_flag = WUNTRACED | WCONTINUED;
    while(1) {
        if((cpid = waitpid(-1, &status, wait_flag)) == -1) {
            if(errno == ECHILD) {
                printf("no more child\n");
                exit(EXIT_SUCCESS);
            } else {
                perror("waitpid");
                exit(EXIT_FAILURE);
            }
        }
        // dissect status
        printf("parent [%d], child [%d] ", (int)getpid(), (int)cpid);
        if(WIFEXITED(status)) { // exited normally
            // print the exit status itself, not the raw wait status
            printf("exit normally with [%d]\n", WEXITSTATUS(status));
        } else if(WIFSIGNALED(status)) { // killed by signal
            const char *dumpinfo = "unknown";
            #ifdef WCOREDUMP
                dumpinfo = WCOREDUMP(status)?"true":"false";
            #endif
            printf("killed by signal [%d], has dump [%s]\n", WTERMSIG(status), dumpinfo);
        } else if(WIFSTOPPED(status)) { // stopped by signal
            printf("stopped by signal [%d]\n", WSTOPSIG(status));
        #ifdef WIFCONTINUED
        } else if(WIFCONTINUED(status)) { // continued by signal
            printf("continued by signal SIGCONT\n");
        #endif
        } else { // this should never happen
            printf("unknown event\n");
        }
    }
    }

    return 0;
}

int main(int argc, char *argv[]) {
    wait_status_test();
    return 0;
}

Compile:

gcc -Wall wait_status_test.c

Execute:

  • run ./a.out and wait for it to terminate normally; the child process id is printed after fork(),
  • run ./a.out, then kill -9 <child_process_id> before it finishes sleeping,
  • run ./a.out, then kill -STOP <child_process_id> before it finishes sleeping, then kill -CONT <child_process_id> to resume it,
Eric