3

I have the C source code shown below.

#include<stdio.h>
#include<stdlib.h>
#include<sys/wait.h>
#include<unistd.h>
#include<sys/types.h>

int main(void) {

   pid_t process_id;
   int status;

   if (fork() == 0) 
   {
       if (fork() == 0)
       {
           printf("A");
       } else {
           process_id = wait(&status);
           printf("B");
       }
   } else {
       if (fork() == 0)
       {
           printf("C");
           exit(0);
       }
       printf("D");
   }
   printf("0");
   return 0;   
}

When I executed it in a terminal, I got the outputs shown in this image:

[image: some outputs for the source code written above]

I'm confused about how these outputs are generated. For example, how is D0A0~$ B0C generated?

Can anyone explain how these outputs are generated, and also what exit(0) does in this code?

the busybee
  • The `~$ ` part of the output is your shell prompt. As for the rest, the only control you have over the output is that `A` must be written before `B` (due to the `wait` call). – Some programmer dude Mar 06 '20 at 07:16
  • As for the `exit` call, it exits the current process immediately. Basically, `exit(0)` is like the `return 0` you have at the end of the `main` function. – Some programmer dude Mar 06 '20 at 07:19
  • Oh, and you have a total of four processes, one which exits before the `printf("0")` call, but the remaining three will "fall through" and print the `0`, and it will be printed after the letter of that specific process. I really recommend some [rubber duck debugging](https://en.wikipedia.org/wiki/Rubber_duck_debugging) here. – Some programmer dude Mar 06 '20 at 07:26
  • @Someprogrammerdude, the `A0` must be printed before `B0`, as both are printed at the end, the output is a terminal and the printf has no newline, so `A0` and `B0` are printed in a single `write(2)` due to stdio buffering. – Luis Colorado Mar 07 '20 at 23:25

2 Answers

8

In general, if you have code like this:

if (fork() == 0) {
  printf("I'm a child\n");
} else {
  printf("I'm a parent\n");
}

printf("This part is common\n");

then the if-branch, where fork() returns zero, is executed in the child process, and the else-branch, where fork() returns a non-zero value, is executed in the parent. After that, execution continues in both processes (still asynchronously), so both the child and the parent execute the code after the if. We can represent this graphically with the following diagram, which shows the code that is executed in each branch:

                                       fork()
                                      /    \
             ------- parent ----------      ---------- child -----------
             |                                                         |
             |                                                         |
   printf("I'm a parent\n");                          printf("I'm a child\n");
   printf("This part is common\n");                   printf("This part is common\n");

Now, let's make the same diagram for your code. After the first fork you split the execution according to the top-most if:

                                    fork()
                                   /    \
         --------- parent ---------      ---------- child -------------
         |                                                            |
         |                                                            |

      if (fork() == 0)                                 if (fork() == 0)
      {                                                {
        printf("C");                                      printf("A");
        exit(0);                                       } else {
      }                                                   process_id = wait(&status);
      printf("D");                                        printf("B");
                                                       }

      // Common code                                  // Common code
      printf("0");                                    printf("0");                
      return 0;                                       return 0;

After the next forks are executed in both the parent and the child, we get the following tree structure:

                                    fork()
                                   /    \
                ----  parent ------      ------ child ------
                |                                          |
              fork()                                     fork()
              /    \                                     /    \
--- parent ---      --- child ---          --- parent ---      --- child ----
|                               |          |                                 |
|                               |          |                                 |
printf("D");           printf("C");      process_id = wait(&status);      printf("A");
                       exit(0);          printf("B");
                       printf("D");

printf("0");           printf("0");      printf("0");                     printf("0");
return 0;              return 0;         return 0;                        return 0;

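Note that printf("D"); appears in both the parent-parent and parent-child branches, because in the source it follows the if (fork() == 0) { } block and is therefore common code for these two branches. In the parent-child branch, however, it is never reached, because exit(0) terminates that process first; the same holds for printf("0"); and return 0; in that column.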
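A minimal sketch of what exit(0) does in that branch: it ends the process on the spot, so the statements after the if run only in the process that did not take the branch. The helper name c_then_d and the pipe used to capture the output are assumptions made for this illustration. The waitpid call is also added here only to make the order deterministic; the program in the question does not wait for this child.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: repeats the "C ... exit(0) ... D" part of the program,
 * but writes into a pipe so the result can be inspected afterwards.
 * Stores everything that was written in buf and returns the child's exit
 * status, or -1 on error. */
int c_then_d(char *buf, size_t size)
{
    assert(buf != NULL && size >= 3);

    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    fflush(NULL);                 /* keep pending stdio data out of the child */
    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        close(fds[0]);
        if (write(fds[1], "C", 1) != 1)
            exit(1);
        exit(0);                  /* the child ends here ... */
    }

    /* ... so only the parent ever reaches the code after the if. */
    int status = 0;
    pid_t reaped = waitpid(pid, &status, 0);  /* added: fixes the order */
    ssize_t written = write(fds[1], "D", 1);
    close(fds[1]);

    ssize_t n = read(fds[0], buf, size - 1);
    close(fds[0]);
    buf[n > 0 ? n : 0] = '\0';

    if (reaped != pid || written != 1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);   /* the value that was passed to exit() */
}
```

Calling c_then_d(buf, sizeof buf) with a small char array should leave a single C and a single D in buf: the child never gets as far as the second write. The return value is the argument the child passed to exit(), which is how a parent can learn that its child finished successfully.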

At that point all 4 processes are executing asynchronously.

  • the parent-parent process prints "D", then "0" then exits

  • the parent-child process prints "C" then exits

  • the child-parent process waits for its child to complete, then prints "B", then "0" and exits

  • the child-child process prints "A", then "0" then exits

As you can see, the output of these processes can be interleaved almost arbitrarily. The only guarantee is that the child-child process prints "A0" before the child-parent process prints "B0". The shell that was used to run the program gets control back once the main process (that is, the parent-parent process) finishes. However, other processes might still be running when control returns to the shell, so output from some of them might appear after the shell prints its command prompt. For example, the following chain of events is possible:

  • the parent-parent process gets control. It prints "D0" and exits, and control returns to the shell.

  • the child-parent process gets control. It starts waiting (blocks) on the child-child process.

  • the child-child process gets control. It prints "A0" and exits.

  • meanwhile, the shell process gets control and prints the command prompt "~$ "

  • the child-parent process gets control. Since the child-child process has finished, it is unblocked, prints "B0" and exits.

  • the parent-child process gets control, prints "C" and exits.

The combined output is "D0A0~$ B0C". This explains the last line in your example.
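The one ordering guarantee can be checked with a small experiment. The sketch below runs the same fork tree, but sends the output into a pipe so that the combined result can be inspected. The names emit and run_fork_tree are made up for this example. Each process writes its letter and its "0" with a single write(2) call, which is an assumption that mimics the buffered behaviour seen on a terminal, and _exit is used instead of return so that the helper can be called from a larger program. There is no shell prompt in this setup, and fork errors inside the tree are not checked.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Writes a short label to fd with one write(2) call. */
static void emit(int fd, const char *s)
{
    if (write(fd, s, strlen(s)) < 0)
        _exit(1);
}

/* Hypothetical helper: runs the fork tree from the question with its output
 * going into a pipe and stores the combined output in buf.
 * Returns 0 on success, -1 on error. */
int run_fork_tree(char *buf, size_t size)
{
    assert(buf != NULL && size > 0);

    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    fflush(NULL);                      /* keep pending stdio data out of the children */
    pid_t runner = fork();
    if (runner == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (runner == 0) {
        /* This process plays the role of the original main(). */
        close(fds[0]);
        if (fork() == 0) {
            if (fork() == 0) {
                emit(fds[1], "A0");    /* child-child */
            } else {
                wait(NULL);            /* child-parent waits for child-child */
                emit(fds[1], "B0");
            }
        } else {
            if (fork() == 0) {
                emit(fds[1], "C");     /* parent-child: exits without a "0" */
                _exit(0);
            }
            emit(fds[1], "D0");        /* parent-parent */
        }
        _exit(0);
    }

    /* Read until end-of-file, which arrives only after all four processes
     * of the tree have exited and closed their copy of the write end. */
    close(fds[1]);
    size_t used = 0;
    ssize_t n;
    while (used < size - 1 &&
           (n = read(fds[0], buf + used, size - 1 - used)) > 0)
        used += (size_t)n;
    buf[used] = '\0';
    close(fds[0]);
    waitpid(runner, NULL, 0);
    return 0;
}
```

Running the helper repeatedly should produce different strings of seven characters, but in every one of them "A0" comes before "B0". Everything else, including where "C" and "D0" land, is up to the scheduler.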

Alex Sveshnikov
  • As I understand from your sentence "The shell which was used to run the program will get the control back after the main process will finish (that is parent-parent process)", the output will always begin with "D0". Is that true? @Alex – Mohsen Mahmoodzadeh Mar 06 '20 at 09:08
  • Well, actually there are two guarantees here: that B0 will appear after A0 and that shell prompt ~$ will appear after D0. The sentence that the output will always begin with D0 is not true, however. For example A0CD0B0~$ is quite valid. – Alex Sveshnikov Mar 06 '20 at 09:10
  • 1
    Note there's also no guarantee that the "0" will immediately succeed the previous character from the same process - ie. output could also be "A0BCD00" or any of the other permutations. Additionally, [output buffering could complicate things even further](https://stackoverflow.com/questions/2530663/printf-anomaly-after-fork). – Sander De Dycker Mar 06 '20 at 09:59
  • how is that possible(no guarantee that the "0" will immediately succeed the previous character from the same process)? @Sander De Dycker – Mohsen Mahmoodzadeh Mar 06 '20 at 16:12
  • @mohsen_m In general, it's not possible to achieve this without cooperation between the processes. For example, if each process locks the stdout file handle, outputs both characters and then releases the lock, then the characters will not be separated. There are also other means of synchronization, but in general this is a tricky part. – Alex Sveshnikov Mar 06 '20 at 16:32
  • @mohsen_m : what if the scheduler decides to switch active processes after printing the first character, but before printing the second? What if two processes run on different cores? Unless you add inter-process synchronization like Alex mentioned, you really should be thinking of different processes as independent entities with no synchronization guarantees (except the ones that are explicitly added). – Sander De Dycker Mar 09 '20 at 07:18
  • @mohsen_m, each `write(2)` call writes the two characters at the end of the program. As the `write()` call locks the inode, no other process can write on that inode (the terminal) until the process writing has finished the system call. Even in a multiprocessor system, the writes go atomically. – Luis Colorado Mar 09 '20 at 17:05
  • @mohsen_m, the output doesn't necessarily have to begin with `D0`, as the main process writes that `D0` at the end of its run. That `D0` can perfectly well go after `A0`, `B0` and `C`. What will never happen is that it goes after the shell prompt. – Luis Colorado Mar 09 '20 at 17:10
  • @mohsen_m, just put a `sleep(1);` after the line `printf("D");`, and you'll see the program printing `D0` at the end, just before the shell prompt. – Luis Colorado Mar 09 '20 at 17:18
  • @SanderDeDycker, your comment is not true, `printf()` buffers the two characters, and they get printed with a single `write(2)` that locks the inode and forces them to appear together in the output... try it, because even in a multiprocessor system, you'll always get the letter and the digit together. – Luis Colorado Mar 09 '20 at 17:22
  • 1
    @LuisColorado : `printf` does not buffer - the output stream might, but that's [not guaranteed](https://stackoverflow.com/questions/3723795/is-stdout-line-buffered-unbuffered-or-indeterminate-by-default). If the output stream doesn't buffer, then the scenario I pointed out can happen. Try it yourself eg. by forcing `stdout` to be unbuffered using [`setvbuf(stdout, 0, _IONBF, 0);`](https://en.cppreference.com/w/c/io/setvbuf). To reproduce it easily with such a small program, you'll probably have to add a `sleep` or `yield` before printing the `0` to force the scheduler to switch processes. – Sander De Dycker Mar 10 '20 at 07:40
0

There are four processes involved here, let's call them a, b, c, and d (d is the parent of b and c, and b is the parent of a):

shell
    `-d
      +-b
      | `-a
      `-c
  • d is the parent process that executes the first fork(2) call (creating process b). As it is the parent, it takes the else branch of that first if and calls fork(2) again (creating process c), then prints the string D0 at the end (both characters are always written together in a single write(2) call, because stdio buffers the data; see the reason below).
  • the first fork of d produces process b, which, after doing a second fork(2) (creating process a), waits for a to finish and then prints B0.
  • the fork() called by b (b only forks once, but this is the second fork() in the listing) produces process a, which prints A0 and then exits. This lets process b continue after the wait(2) call and print B0, which forces A0 to always come before B0.
  • the third fork() produces process c, whose task is to print C and call exit(3) (this is why only C is output, and not C0).
  • as d doesn't wait for either of its two children, it exits as soon as D0 is printed, which makes the shell output the prompt ~$ and wait for a new command. This forces D0 to come before the shell prompt.
  • the ^C is a Control-C pressed by you, thinking that the program was still running; it makes the shell emit a second prompt on a new line.

Since the only processes that wait for their children are the shell (forcing the prompt to be printed after D0) and process b, which waits for process a (forcing A0 to always be printed before B0), any other sequence is permitted, depending on how the system schedules the processes, and that includes the position of the prompt. Keep in mind that each process prints its message at the very end of its execution.
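If the goal were to keep any output from appearing after the prompt, one option would be to make the top-level process wait for its children before it exits. A minimal sketch of that idea follows; spawn_children and reap_all_children are hypothetical helpers, not part of the original program. Note that wait(2) only covers direct children, which is enough here because b already waits for a.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: starts n children that exit at once.
 * Returns how many children were actually started. */
int spawn_children(int n)
{
    assert(n >= 0);

    int started = 0;
    fflush(NULL);                 /* keep pending stdio data out of the children */
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);             /* child: nothing to do in this sketch */
        if (pid > 0)
            started++;
    }
    return started;
}

/* Waits until no direct child is left and returns how many were reaped. */
int reap_all_children(void)
{
    int reaped = 0;
    for (;;) {
        pid_t pid = wait(NULL);
        if (pid > 0)
            reaped++;
        else if (errno != EINTR)
            break;                /* ECHILD: no children left to wait for */
    }
    return reaped;
}
```

In the program from the question, a call like reap_all_children() placed just before the final return 0; in process d would hold d back until b and c are done, so the shell could not print its prompt before B0 or C. The relative order of the letters would still be up to the scheduler.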

The possible orders are the permutations of A0, B0, D0, C and the shell prompt. Of all these permutations, half have A0 and B0 in the wrong order, and of the remaining half, half have the shell prompt and D0 in the wrong order, so the number of possibilities is 5!/4 == 30. See if you can get them all! :)
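The count can be double-checked by brute force. The sketch below generates every ordering of the five pieces of output and keeps only those where A0 precedes B0 and D0 precedes the prompt. The token names and function names are made up for this example, and, like the reasoning above, it treats each piece as an indivisible unit.

```c
#include <assert.h>

/* The five pieces of output, each treated as one indivisible unit. */
enum { TOK_A0, TOK_B0, TOK_C, TOK_D0, TOK_PROMPT, NTOKENS };

/* Returns the index at which token appears in order. */
static int position(const int *order, int token)
{
    for (int i = 0; i < NTOKENS; i++)
        if (order[i] == token)
            return i;
    assert(0);                    /* every token is present in a full ordering */
    return -1;
}

/* Extends the partial ordering in order[0..depth-1] in every possible way
 * and counts the complete orderings that respect both constraints. */
static int count_from(int *order, int depth, unsigned used)
{
    if (depth == NTOKENS)
        return position(order, TOK_A0) < position(order, TOK_B0) &&
               position(order, TOK_D0) < position(order, TOK_PROMPT);

    int total = 0;
    for (int t = 0; t < NTOKENS; t++) {
        if (used & (1u << t))
            continue;             /* token already placed */
        order[depth] = t;
        total += count_from(order, depth + 1, used | (1u << t));
    }
    return total;
}

/* Number of orderings with A0 before B0 and D0 before the prompt. */
int count_valid_orders(void)
{
    int order[NTOKENS];
    return count_from(order, 0, 0u);
}
```

count_valid_orders() returns 30, which matches 5!/4.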

explanation of D0A0~$ B0C

A possible scheduling that produces the output above is the following:

  • the shell starts process d and waits for it to finish.
  • process d forks twice, creating processes b and c. Prints D0, and exits.
  • process b forks, and creates process a, and waits for a to finish.
  • process a prints A0 and exits.
  • the shell awakens from wait() and prints the prompt ~$.
  • process b awakens from wait() for a, and prints B0.
  • process c prints C and exits.

printf() buffering.

In terminal mode (when output goes to a terminal), stdout is typically line buffered. If you have, let's say, a 512-byte buffer, stdio keeps filling it until it sees a \n character or the buffer fills completely, and then flushes the whole buffer contents to standard output with a single write(2) call. As a result, the sequence:

printf("D");
...
printf("0");

causes both characters to accumulate in the buffer and to be written together at the very end of the process, when exit(3) flushes the stdio buffers (returning from main has the same effect as calling exit(3)).

The possible outputs:

All the possible combinations are shown below:

D0A0B0C~$, D0A0B0~$ C, D0A0CB0~$, D0A0C~$ B0, D0A0~$ CB0, D0A0~$ B0C, D0CA0B0~$, D0CA0~$ B0, D0C~$ A0B0, D0~$ CA0B0, D0~$ A0CB0, D0~$ A0B0C, A0D0B0C~$, A0D0B0~$ C, A0D0CB0~$, A0D0C~$ B0, A0D0~$ CB0, A0D0~$ B0C, A0B0D0C~$, A0B0D0~$ C, A0B0CD0~$, A0CB0D0~$, A0CD0B0~$, A0CD0~$ B0, CA0B0D0~$, CA0D0B0~$, CA0D0~$ B0, CD0A0B0~$, CD0A0~$ B0, CD0~$ A0B0.

Luis Colorado