3

I've found that an open file stream gets messed up if we call fork() before closing it. It is well known that race conditions can occur when the parent and child processes both try to modify the file stream, but even when the child process never touches the stream, the behavior is still undefined. I was wondering if someone could explain this, perhaps in terms of how the kernel handles a file stream while the child process is forked and then exits.

Below is a quick snippet of a strange behavior:

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h> 
#include <sys/wait.h> 

int main() {
    // Open file
    FILE* fp = fopen("test.txt", "r");

    int count = 0;
    char* buffer = NULL;
    size_t capacity = 0;
    ssize_t line = 0;
    while ( (line = getline(&buffer, &capacity, fp)) != -1 ) {
        if (line > 0 && buffer[line - 1] == '\n') // remove the end '\n'
            buffer[line - 1] = 0;

        pid_t pid = fork();
        if (pid == 0) {
            // fclose(fp); // Magic line here: when you add this, everything is fine
            if (*buffer == '2')
                execlp("xyz", "xyz", NULL);
            else
                execlp("pwd", "pwd", NULL);
            exit(1);
        } else {
            waitpid(pid, NULL, 0);
        }
        count++;
    }
    printf("Loops: %d\n", count);
    return 0;
}

Just copy the code into a new file (e.g., test.c) and create a text file test.txt with the simple content

1
2
3
4

and run

$ gcc test.c && ./a.out

There are 4 lines in the file. The loop is expected to read each line and run exactly 4 times (1 2 3 4), and I chose to exec an invalid command "xyz" on the 2nd iteration. You will find that the loop actually executes 6 times (1 2 3 4 3 4)! When all four commands executed are valid, nothing goes wrong, but as soon as an invalid command is executed, every line after it is processed twice. (Please note that this strange behavior only occurs on my Linux machine; macOS is doing fine, and I'm not sure about Windows. So the problem is platform-dependent?)

It looks like whenever I fork(), the file stream in the parent is no longer guaranteed to stay in its old state (non-deterministic behavior), even though my child process never touches it.

A temporary solution I found is to fclose(fp) in the child process. This silences the strange behavior above, but in more complex situations there are still other oddities to be observed. I would appreciate it if somebody could give me some insight into this problem. Thanks

  • 2
    It is necessary to close open file descriptors before calling `execlp`, see [here](https://stackoverflow.com/questions/37582058/is-closing-a-pipe-necessary-when-followed-by-execlp). So your temporary solution is not temporary, but a real one. The "other things" might have different reasons. – Eugene Sh. Sep 28 '18 at 21:17
  • 1
    All of your problems will be resolved if you do _both_ of the following things: (1) call `fflush(0)` immediately before calling `fork`; (2) call `_exit` instead of `exit` when `exec` fails. It is actually _wrong_ to call `fclose(fp)` in the child process. – zwol Sep 28 '18 at 21:28 (see the sketch after these comments)
  • 1
    Addendum: It may be appropriate to call `close(fileno(fp))` in the child before calling `exec`, and/or to set the close-on-exec bit on that file descriptor immediately after the `fopen`. However, whether this is actually correct depends on details of your full program that you have not shown us. – zwol Sep 28 '18 at 21:33
  • Thanks much, guys, I will try these solutions – Symphony Huang Sep 28 '18 at 22:14
  • See my answer to the question [Why does forking my process cause the file to be read infinitely?](https://stackoverflow.com/questions/50110992/why-does-forking-my-process-cause-the-file-to-be-read-infinitely/50112169#50112169), and especially the sections headed POSIX and Exegesis. I'm fairly sure that will account for what you're seeing. It is very subtle. It also took 30 years to actually see that sort of thing happening — Linux finally opted to exploit the sloppiness (at least, that's what it feels like from 'out here'). – Jonathan Leffler Sep 28 '18 at 23:00
  • @JonathanLeffler Very well-written answer there. Thanks a lot – Symphony Huang Sep 29 '18 at 01:28
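
For reference, here is a minimal, untested sketch of what the comment above suggests, adapted from the loop body in the question (it is not code posted by any of the commenters): flush all stdio streams before fork(), and use _exit() in the child so a failed exec does not run stdio cleanup on the stream shared with the parent:

fflush(NULL);                  /* flush every open stdio stream before forking */
pid_t pid = fork();
if (pid == 0) {
    if (*buffer == '2')
        execlp("xyz", "xyz", (char *)NULL);
    else
        execlp("pwd", "pwd", (char *)NULL);
    _exit(127);                /* exec failed: leave without stdio cleanup */
} else {
    waitpid(pid, NULL, 0);
}

Unlike exit(), _exit() skips atexit handlers and stdio cleanup, so the child never flushes or closes the FILE it shares with the parent.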

1 Answer

0

As already said in the comments, you need to close open file descriptors before calling exec.

In this blog post (section 4) there is a neat code sample you can use to ensure all file descriptors are closed, even in complex applications where you don't always know which files are open at the moment:

int i;
for (i = getdtablesize(); i > 2; --i)
    close(i); /* close all descriptors */

(slightly modified to keep stdin, stdout, stderr open)
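
In the question's program this loop belongs in the child branch, before the execlp call. A rough, untested sketch of the child (also switching exit to _exit, as recommended in the comments on the question):

pid_t pid = fork();
if (pid == 0) {
    int fd;
    for (fd = getdtablesize(); fd > 2; --fd)   /* keep stdin, stdout, stderr */
        close(fd);
    execlp("pwd", "pwd", (char *)NULL);        /* or "xyz", as in the question */
    _exit(1);                                  /* exec failed */
}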

It's kind of hacky, but it works. If you want to avoid that, you can instead set the O_CLOEXEC flag on each file descriptor that you open. Since fopen does not call open() directly, you accomplish this by adding the 'e' flag to the mode string (when using glibc >= 2.7):

FILE* fp = fopen("test.txt", "er");

When calling exec*(), all file descriptors with this flag set are automatically closed.
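
If the stream is already open and you cannot change the fopen call (or your glibc is older than 2.7), the comments on the question suggest setting the close-on-exec bit afterwards. A rough sketch using fcntl(), where set_cloexec is just an illustrative helper name:

#include <fcntl.h>
#include <stdio.h>

/* Mark the descriptor behind an already-open stream as close-on-exec. */
static int set_cloexec(FILE *stream) {
    int fd = fileno(stream);
    int flags = fcntl(fd, F_GETFD);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}

Calling set_cloexec(fp) right after the fopen in the question has the same effect as the 'e' mode flag.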

Richard
  • Just for completeness: "*can accomplish this by adding the 'e' flag*" this is a GNU extension coming with glibc 2.7 only (see the NOTES section [here](http://man7.org/linux/man-pages/man3/fopen.3.html)). – alk Sep 29 '18 at 09:49
  • Right. I added it to the answer. – Richard Sep 29 '18 at 09:56