As stated in the manual of fork() and as you said in your post:
The child inherits copies of the parent's set of open file
descriptors. Each file descriptor in the child refers to the
same open file description (see open(2)) as the corresponding
file descriptor in the parent. This means that the two file
descriptors share open file status flags, file offset, and
signal-driven I/O attributes (see the description of F_SETOWN
and F_SETSIG in fcntl(2)).
FILE is a data structure layered on top of the OS file descriptor. It is essentially a dynamically allocated object in user-space memory, so it is copied from the parent to the child process at fork time. Hence, any data still buffered in this structure is duplicated in the child, and each process may flush its own copy into the target file at any time. As the file offset is shared between parent and child, the second flush lands right after the first one instead of overwriting it, so the same data shows up twice in the output. Consider the following program as an example:
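#include <sys/types.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

int main(void)
{
    // No trailing newline and no flush: the string stays in the stdio buffer
    printf("String to be printed on stdout");

    if (fork() == 0) {
        // Child: flushes its own copy of the buffer
        fflush(stdout);
    } else {
        // Parent: flushes the original buffer
        fflush(stdout);
        wait(NULL);
    }

    return 0;
}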
When running it, we can see two prints resulting from a single printf() originally called from the parent:
$ gcc try.c
$ ./a.out
String to be printed on stdoutString to be printed on stdout$
And of course, any later output from the child or the parent will be interleaved in the output, as both of them advance the same underlying file offset.
NB: The goal of the stdio subsystem in the C library is to minimize the number of system calls by using an intermediate buffer in which the I/O is done first. The actual system calls are triggered only when there is no other choice (a full buffer must be flushed to append data to the output file, a seek operation goes out of the bounds currently mapped by the buffer, an explicit flush is requested...). So, if one of the processes calls rewind() or ftell(), this may or may not result in an actual system call, depending on the state of the buffer in the calling process. The other process is affected only if a system call is made. There is no one-to-one correspondence between the library calls and the equivalent system calls. Typically, there are more library calls than system calls (e.g. multiple fwrite() calls may trigger only one write() system call).