
As the fork manual states, a forked child process gets copies of the parent's open file descriptors, so the parent and the child share file offsets.

Stdio streams are built on top of file descriptors, but they include buffers that are private to each process. At the moment of forking the child inherits those FILE*, but do parent and child share the same buffers and the same file offsets?

I've searched through the manual and this forum but have found nothing about FILE*, only about file descriptors. Sorry if this question is a duplicate but I haven't been able to find it.

Rachid K.
Alex_01
  • The `FILE` structures and buffers are in memory, and the child gets a copy of the memory. So any buffered data that hasn't been flushed will be in both parent and child. Changes to the buffers that are made after the fork are private. – Barmar Apr 10 '23 at 16:33
  • See [`printf()` anomaly after `fork()`](https://stackoverflow.com/q/2530663/15168) for lots of details. The processes share an 'open file description' after the `fork()`, which is different from an 'open file descriptor'. But it does mean that the actions of the parent via the file stream affect the underlying file descriptor in the child, and vice versa. The mode flags for `open()` are also relevant (e.g. `O_APPEND` or not), and the more sophisticated flags are largely hidden from you by `fopen()` (you can specify explicitly with `open()` many flags that you cannot specify with `fopen()`). – Jonathan Leffler Apr 10 '23 at 17:26
  • So, for example, say I have this sequence of characters in a file: abcdefghaijklmn (which both child and parent can access via the same FILE*). If I do a rewind on the child's FILE* and then read a single character from the parent's FILE*, should it read the 'a'? – Alex_01 Apr 11 '23 at 18:47
  • It depends on the state of the `FILE` objects in the parent and child at the time. Generally, the only reasonable way you can use a FILE in both the parent and the child after forking is if it is in append mode or on a non-seekable device, and you fflush it before the fork. After a fork, you can't fseek on one and expect the other to be in a sensible state. – Chris Dodd Apr 12 '23 at 07:23

1 Answer


As stated in the manual of fork() and as you said in your post:

The child inherits copies of the parent's set of open file descriptors. Each file descriptor in the child refers to the same open file description (see open(2)) as the corresponding file descriptor in the parent. This means that the two file descriptors share open file status flags, file offset, and signal-driven I/O attributes (see the description of F_SETOWN and F_SETSIG in fcntl(2)).

FILE is a data structure layered on top of the OS file descriptor. It lives in the process's memory, which is copied from the father to the child at fork time. Hence, any data buffered in this structure is duplicated in the child process, and each process may flush its own copy to the target file at any time. As the file offset is shared between father and child, the second flush lands after the first one instead of overwriting it, so the data shows up twice on the output. Consider the following program as an example:

#include <sys/types.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/wait.h>

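int main(void)
{
  // No newline and no flush: the text stays in the stdout buffer
  printf("String to be printed on stdout");

  if (fork() == 0) {

    // Child: flushes its own copy of the buffer
    fflush(stdout);

  } else {

    // Father: flushes the original buffer, then reaps the child
    fflush(stdout);

    wait(NULL);
  }

  return 0;
}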

When running it, we can see two prints resulting from a single printf() originally called from the father:

$ gcc try.c
$ ./a.out 
String to be printed on stdoutString to be printed on stdout$

And of course, any later prints from the child or the father will be interleaved on the output, as both processes advance the same underlying file offset.

NB: The goal of the stdio subsystem in the C library is to minimize the number of system calls by doing the I/O in an intermediate buffer first. The actual system calls are issued only when there is no choice (the buffer is full and must be flushed before more data can be appended, a seek goes outside the range currently held in the buffer, an explicit flush is requested...). So, if one of the processes calls rewind() or ftell(), this may or may not result in an actual system call, depending on the state of the buffer in the calling process. The other process is affected only if a system call is made. There is no one-to-one correspondence between library calls and the equivalent system calls. Typically, there are more library calls than system calls (e.g. multiple fwrite() calls may trigger only one write() system call).

Rachid K.
  • Thanks, I think I understand the buffer behaviour after a fork. But what about the file offsets? If I do a rewind on the child's FILE* and then try to read from the parent's FILE*, will it be affected by the child's rewind? – Alex_01 Apr 11 '23 at 18:41