There is a somewhat famous Unix brain-teaser: write an if expression to make the following program print Hello, world! on the screen. The expr in if must be a legal C expression and should not contain other program structures.
if (expr)
    printf("Hello, ");
else
    printf("world!\n");
The answer is fork().
When I was younger, I just had a laugh and forgot about it. Rethinking it now, I can't understand why this program is so much more reliable than it should be. The order of execution after fork() is not guaranteed, so a race condition exists, yet in practice you almost always see Hello, world!\n and almost never world!\nHello, .
To demonstrate it, I ran the program 100,001 times, appending the output of each run to a log.
for i in {0..100000}; do
    ./fork >> log
done
On Linux 5.9 (Fedora 32, gcc 10.2.1, -O2), after 100,001 executions, the child won only 146 times (about 0.146%), so the parent won about 99.854% of the time.
$ uname -a
Linux openwork 5.9.14-1.qubes.x86_64 #1 SMP Tue Dec 15 17:29:47 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ wc -l log
100001 log
$ grep ^world log | wc -l
146
The result is similar on FreeBSD 12.2 (clang 10.0.1, -O2). The child won only 68 times out of 100,001, or about 0.068% of the time, while the parent won about 99.932% of all executions.
An interesting side note is that ktrace ./fork instantly changes the dominant result to world!\nHello, (because only the parent is traced), demonstrating the Heisenbug nature of the problem. Tracing both processes via ktrace -i ./fork reverts the behavior, because both processes are then traced and equally slow.
$ uname -a
FreeBSD freebsd 12.2-RELEASE-p1 FreeBSD 12.2-RELEASE-p1 GENERIC amd64
$ wc -l log
100001 log
$ grep ^world log | wc -l
68
Independence from Buffering?
An answer suggests that buffering can influence the behavior of this race condition. But the behavior persists after removing \n from printf().
if (expr)
    printf("Hello");
else
    printf("World");
The behavior also persists after turning off stdout's buffering via stdbuf on FreeBSD.
for i in {0..10000}; do
    stdbuf -i0 -o0 -e0 ./fork >> log
    echo >> log
done
$ wc -l log
10001 log
$ grep -v "^HelloWorld" log | wc -l
30
Why does printf() in the parent almost always win the race condition after fork() in practice? Is it related to the internal implementation details of printf() in the C standard library? The write() system call? Or process scheduling in the Unix kernels?