Using fork with c

Question

This is an academic question: I only want to understand the output.

I have this code:

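    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char **argv) {
        int k;
        /* ++argv skips argv[0]; the loop stops at the terminating NULL */
        while(*(++argv)) {
            k = fork();
            printf("%s ", *argv);
        }
        return 0;
    }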

Running the program with: prog a b
The output is:

  a b a b a b a b

Why do I get this result?

`while(*(++argv))`? That's ugly, I can't see what it's doing (can you?); break it down. I have a feeling that's where the error is coming from. — user541686, Jun 18 '11 at 22:05
I could recreate this, but if I changed the `"%s "` to `"%s\n"`, it behaved as one would expect (a\na\nb\nb\nb\nb\n). — Ben Hocking, Jun 18 '11 at 22:11
@Ben - That would be due to buffering. It's too subtle for me to explain it fully in an answer from my iPhone, but basically `fork` copies the (never flushed) buffer of `stdout` every time. — Chris Lutz, Jun 18 '11 at 22:17
The code was not written by me. I just want to understand the output. — igascream, Jun 18 '11 at 22:18
@Chris Lutz I will put that in an answer if you cannot do it conveniently. I tip my hat to you, sir. — Pascal Cuoq, Jun 18 '11 at 22:18
@Pascal - Go for it. I'm so happy to have actually figured it out that I don't particularly care about the rep. — Chris Lutz, Jun 18 '11 at 22:21

Pascal Cuoq · Accepted Answer · 2011-06-18T22:29:54.303

As suggested by Chris Lutz in the comments, you are observing the effect of the stdout buffer used by printf being duplicated by the fork() call. The two processes created by the second round of fork() calls do not print just b (as you might expect, and as happens if you force a flush). They both print a b because each of them inherits a pending, unflushed a in its copy of the buffer.

There are 4 processes (2^2, including the initial one). None of them really prints anything until exit, when the buffer is flushed, and each of them has a b in its buffer at that time.
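To observe the duplicated buffer directly, here is a minimal sketch for a POSIX system. The helper name `capture_fork_loop` and its parameters are made up for illustration and are not part of the question's program. It runs the same fork/printf loop in a child whose stdout is a pipe, optionally flushes before each fork(), and returns how many bytes the whole process tree wrote. It assumes stdout is buffered (the default).

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs the question's loop over the NULL-terminated
 * list `args` in a child process whose stdout is redirected into a pipe.
 * Everything written by that child and its descendants is collected in
 * `out`. Returns the number of bytes captured, or -1 on error. */
long capture_fork_loop(char *const *args, int flush_before_fork,
                       char *out, size_t out_cap)
{
    int fds[2];
    size_t used = 0;
    pid_t pid;

    assert(args != NULL && out != NULL && out_cap > 0);
    if (pipe(fds) == -1)
        return -1;

    fflush(stdout);              /* keep the caller's pending output out of the copy */
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: point stdout at the pipe, then run the loop from the question. */
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(1);
        close(fds[1]);
        for (; *args; ++args) {
            if (flush_before_fork)
                fflush(stdout);  /* empty the buffer so fork() has nothing to copy */
            fork();
            printf("%s ", *args);
        }
        fflush(stdout);          /* stands in for the flush done by a normal exit */
        _exit(0);
    }

    /* Parent: read until every process holding the write end has exited. */
    close(fds[1]);
    for (;;) {
        ssize_t n = read(fds[0], out + used, out_cap - 1 - used);
        if (n > 0)
            used += (size_t)n;
        else if (n == 0 || errno != EINTR)
            break;               /* EOF, full buffer, or a real error */
    }
    out[used] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return (long)used;
}
```

Called with the arguments a and b, the unflushed variant should capture 16 bytes (four copies of "a b "), while the flushing variant should capture 12 (two "a " and four "b "). The order of the pieces depends on scheduling, so only the totals are stable.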

score 1 · Answer 2 · answered Jun 18 '11 at 22:23

In the beginning argv points to argv[0], which is the executable file name. It is incremented once inside while() so that it points to argv[1].

Now it hits fork(), which creates a second process that continues from the same line.

Both processes write a to their own stdout buffer.

Now argv is advanced by one element in both processes (inside while()), as they essentially work with copies, if I remember that correctly.

The fork() in each process now creates one additional copy of that process, so there are 4 in total.

Now the 4 processes all still have the 'a ' in their stdout buffer, which is copied along with the process (I think so; it would be nice if anyone could confirm this), and their argv points to b. This one is written as well, so now we've got 4 processes each having 'a b ' in their output buffers.

Once they exit, their buffers are flushed, resulting in 'a b a b a b a b ' (essentially 'a b ', 'a b ', 'a b ', and 'a b ').

Ben's comment can be explained by the flush that each line break triggers when stdout is line buffered.
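The walkthrough above can be replayed without creating any real processes. The following is only a model, under the assumption that fork() duplicates the pending stdout buffer and that every process flushes at exit. All names (`simulate_fork_loop`, `sim_append`, the `SIM_` constants) are invented for this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SIM_MAX_PROCS 64    /* enough for up to 6 arguments (2^6 processes) */
#define SIM_BUF_SIZE  128   /* capacity of each simulated stdout buffer */

/* Appends src to dst if it fits in cap bytes; returns 0 on success. */
static int sim_append(char *dst, size_t cap, const char *src)
{
    size_t used = strlen(dst), need = strlen(src);
    if (used + need + 1 > cap)
        return -1;
    memcpy(dst + used, src, need + 1);
    return 0;
}

/* Models the question's loop. Each simulated process is just its pending
 * stdout buffer. Returns the final process count, or -1 if space runs out.
 * The combined output (in process order, not real scheduling order) is
 * written to `out`. */
int simulate_fork_loop(const char *const *args, int flush_each_iteration,
                       char *out, size_t out_cap)
{
    static char bufs[SIM_MAX_PROCS][SIM_BUF_SIZE];
    int nprocs = 1;

    assert(args != NULL && out != NULL && out_cap > 0);
    out[0] = '\0';
    bufs[0][0] = '\0';

    for (; *args; ++args) {
        if (nprocs * 2 > SIM_MAX_PROCS)
            return -1;
        /* fork(): every process gains a child holding a copy of its buffer */
        for (int i = 0; i < nprocs; ++i)
            strcpy(bufs[nprocs + i], bufs[i]);
        nprocs *= 2;
        /* printf("%s ", *argv): every process appends to its own buffer */
        for (int i = 0; i < nprocs; ++i) {
            if (sim_append(bufs[i], SIM_BUF_SIZE, *args) ||
                sim_append(bufs[i], SIM_BUF_SIZE, " "))
                return -1;
            if (flush_each_iteration) {   /* fflush(stdout) at the end of the pass */
                if (sim_append(out, out_cap, bufs[i]))
                    return -1;
                bufs[i][0] = '\0';
            }
        }
    }
    /* exit: whatever is still pending gets written */
    for (int i = 0; i < nprocs; ++i)
        if (sim_append(out, out_cap, bufs[i]))
            return -1;
    return nprocs;
}
```

For the arguments a and b the model yields "a b a b a b a b " without flushing and "a a b b b b " when each pass ends with a flush, which matches the outputs reported in this thread. Real processes may interleave their writes differently.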

You can also verify that the buffer is to blame by adding `fflush(stdout);` at the end of each loop iteration. That should print the expected "a a b b b b " output. — Chris Lutz, Jun 18 '11 at 22:29
I just wasn't 100% sure about the buffer being copied as well. — Mario, Jun 18 '11 at 22:33
The OS may do some fancy copy-on-write semantics but it should copy the data somewhere between the fork and writing to the buffer. — Chris Lutz, Jun 18 '11 at 22:37

score 0 · Answer 3 · answered Jun 18 '11 at 22:06

I don't know about "fork", but I can tell you that on the second iteration of the loop you are accessing a meaningless area of memory. That surely returns corrupted, meaningless results, and worse, it can break the execution of the program.

You should, of course, be doing something like

while ((--argc) >= 0) { (...) }

if you don't need to know the original value of argc after the loop, or else

int i = 0;
while (i++ < argc) { (...) }

SQL Learner

The standard specifies that `argv[argc] == NULL`, ensuring that the loop ends safely. That's not the problem. – Chris Lutz Jun 18 '11 at 22:08
Hmm... I didn't know about that. Sorry then. But, anyway, it's clear that he should not be using `*(++argv)` since, anyway, on the second iteration it is meaningless. – SQL Learner Jun 18 '11 at 22:10
On the last iteration it's meaningless. Its purpose in the loop is to skip `argv[0]`, which is the program name. It's ugly but it works. – Chris Lutz Jun 18 '11 at 22:18
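The point made in these comments can be checked with a small sketch. The function names `count_by_sentinel` and `count_by_index` are hypothetical. The first walks argv the way the question does, relying on `argv[argc] == NULL`; the second uses a conventional index loop. Both assume `argv[0]` is present, which is what makes the initial `++argv` safe.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the arguments after argv[0] using only the NULL terminator,
 * exactly like the loop condition in the question. */
int count_by_sentinel(char **argv)
{
    int n = 0;

    assert(argv != NULL && argv[0] != NULL);   /* needed before stepping past argv[0] */
    while (*(++argv))   /* skip argv[0], stop at argv[argc], which is NULL */
        ++n;
    return n;
}

/* The same count with an explicit index, starting at 1 to skip argv[0]. */
int count_by_index(int argc, char **argv)
{
    int n = 0;

    for (int i = 1; i < argc; ++i)
        ++n;
    (void)argv;         /* unused here; kept so both functions look alike */
    return n;
}
```

For an argument vector equivalent to `prog a b`, both functions return 2, so the pointer form never reads past the terminating NULL.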

score 0 · Answer 4 · answered Jun 18 '11 at 22:46

First things first: stdout is line buffered when it is connected to a terminal, i.e., the buffer is flushed whenever a newline is encountered. If you put a newline in the printf() format, then the results change. However, if you are writing to a file, which is fully buffered, the output does not change even if you add the newline to printf().

After the first fork() call and the printf() that follows it, P (the parent) has 'a ' in its buffer and so does C1 (C for child).

Then, after the second round of fork() calls, two new children are created, C2 and C3. The buffers are copied along with the processes, so C2 and C3 also contain 'a ' now. After the printf() call, all four processes contain 'a b ' in their buffers. When they exit, their buffers are flushed, hence the output.
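Which buffering mode applies can be estimated before any output is written. This is a rough sketch only: the enum and the function `guess_stdout_buffering` are invented names, and the mapping reflects the usual default (line buffered on a terminal, fully buffered otherwise) rather than a guarantee. A program can override the mode with setvbuf().

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical labels for the two defaults discussed in this answer. */
enum buffering_guess {
    GUESS_LINE_BUFFERED,    /* typical when the descriptor is a terminal */
    GUESS_FULLY_BUFFERED    /* typical for files and pipes */
};

/* Guesses the default stdio buffering for a stream opened on `fd`.
 * This only inspects the descriptor; it does not query the FILE itself. */
enum buffering_guess guess_stdout_buffering(int fd)
{
    assert(fd >= 0);
    return isatty(fd) ? GUESS_LINE_BUFFERED : GUESS_FULLY_BUFFERED;
}
```

Passing `STDOUT_FILENO` from a program started in a terminal would normally give the line-buffered guess, while the same program run with its output redirected to a file or pipe would give the fully-buffered one.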
