11

I am creating a multi-process program. When I called fork() in a for loop with if(f == 0) break;, I got the desired number of child processes.

However now, I am dealing with an input file, and the desired number of processes is not known initially. Here is the smallest possible example of my code.

FILE* file = fopen("sample_input.txt", "r");
while(fscanf(file, "%d", &order) == 1){      
    f = fork();
    if(f == 0){
        break;
    } 
}

example sample_input.txt:

5 2 8 1 4 2

Now thousands of child processes are being created (I want 6, the number of integers in the file). What could be the reason? Does it have something to do with the file pointer?

Edit: I did some debugging with console output. The child processes are indeed breaking out of the loop, but the parent keeps reading the small file over and over. If I remove fork(), the loop executes 6 times as intended.

Edit2: I have a theory that I can't prove; maybe you can help me. It could be that the file pointer is shared between the processes: when a child exits, it closes the file, and when the parent tries to read again, it starts from the beginning (or shows some other weird behavior). Could that be the case?

Max Paython
  • @Someprogrammerdude What do you mean by definitive bug ? – Max Paython May 09 '18 at 03:10
  • You can try resetting errno at the end of the loop and checking it at the start of the loop, to see whether you get anything when your loop reads beyond 6 – Pras May 09 '18 at 03:18
  • [This simple program](https://gist.github.com/pileon/9441a2b15ea498191715cda13c966ca1) replicates your behavior. *However*, if I uncomment the line with `fclose` it will work fine. Unless you need the file after the loop, you could "fix" your problem simply by closing it. – Some programmer dude May 09 '18 at 03:29
  • @Someprogrammerdude why do you think closing it early(in the child process) fixes it ? – Max Paython May 09 '18 at 03:37
  • 1
    To address your second theory directly: A child process with a file handle pointing at the same kernelspace structure exiting **does not** cause any kind of undesired/unusual/&c. behavior. – Charles Duffy May 09 '18 at 03:51
  • 2
    @CharlesDuffy: until last week, I'd have agreed with you. However, see [Why does forking my process cause the file to be read infinitely?](https://stackoverflow.com/questions/50110992/why-does-forking-my-process-cause-the-file-to-be-read-infinitely/50112169#50112169), which shows that the GNU C Library on Linux behaves peculiarly. – Jonathan Leffler May 09 '18 at 03:55
  • @JonathanLeffler Please note that I too tested this C code on Ubuntu 16.04 LTS, in a VMware system. – Max Paython May 09 '18 at 04:05
  • I think that this is really a duplicate of the cross-referenced question. Does anyone see a reason not to close it as a duplicate? – Jonathan Leffler May 09 '18 at 04:11
  • @JonathanLeffler I agree that it seems to be the same root problem as the other question. But if the OP doesn't think it answers his question, then I think the question should remain open. – ImprobabilityCast May 09 '18 at 04:31
  • If glibc did that unconditionally/regularly, this would be a well-known issue rather than a (rather astonishing) niche one. Curious as to the trigger. – Charles Duffy May 09 '18 at 04:34
  • @CharlesDuffy: I've created GLIBC Bug 23151 (see update to my answer for URL). – Jonathan Leffler May 09 '18 at 06:18
  • Have you tried using protection? – Jordan May 09 '18 at 07:36

2 Answers

12

When the first process reads the first number, it actually reads the whole line into memory. The process forks.

The child process breaks the loop; what happens next is not specified, but it probably exits. The parent process now reads the second number and forks again. Again, the child exits and the parent reads the third number, forks, etc.

After the sixth number is read and the sixth child exits, the parent goes to read another buffer from the file. On Linux (or, more precisely, with the GNU C Library), you then get some weird effects; see the discussion in Why does forking my process cause the file to be read infinitely? for the details. In short, the exiting children move the read position of the shared file descriptor back towards the start, so the parent can read more data again.
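The sharing itself can be observed without stdio. The following is a minimal sketch, not the questioner's code: the helper name `offset_after_child_rewind` is hypothetical, and it assumes a regular (seekable) file of at least a few bytes. The child seeks the inherited descriptor and the parent then sees the moved offset, because both descriptors refer to the same open file description.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the parent's file offset after a child has
 * rewound the inherited descriptor, or -1 on error. */
long offset_after_child_rewind(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Advance the offset in the parent by reading a few bytes. */
    char buf[4];
    if (read(fd, buf, sizeof buf) < 0) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: this seek moves the offset that the parent sees too. */
        lseek(fd, 0, SEEK_SET);
        _exit(0);
    }

    /* Parent: wait for the child, then query the offset without moving it. */
    waitpid(pid, NULL, 0);
    off_t pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return (long)pos;
}
```

Called on the sample file, the helper should report offset 0 even though the parent itself read past it. A stdio stream sits on top of such a descriptor, which is how a seek performed by an exiting child can affect the parent's next read.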

My answer to the other question shows that if the child processes close the file before exiting, this behaviour does not occur. (It shouldn't occur anyway, but it does, empirically.)


GLIBC Bug 23151

GLIBC Bug 23151 - A forked process with unclosed file does lseek before exit and can cause infinite loop in parent I/O.

The bug was created 2018-05-08 US/Pacific, and was closed as INVALID by 2018-05-09. The reason given was:

Please read http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_05_01, especially this paragraph:

Note that after a fork(), two handles exist where one existed before. […]

Please see Why does forking my process cause the file to be read infinitely? for an extensive discussion of this.
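One way to follow the handle rules in that POSIX section is to call fflush() on the input stream before each fork(), so that the descriptor's offset matches the stream's logical position and the child inherits no unread buffered input. The sketch below is only an illustration under stated assumptions: the helper name `spawn_per_number` is hypothetical, the child does no real work, and it relies on POSIX semantics for fflush() on a seekable input stream (ISO C alone leaves that undefined).

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork one child per integer in the file at `path`.
 * Returns the number of children created, or -1 on error. */
int spawn_per_number(const char *path)
{
    FILE *file = fopen(path, "r");
    if (file == NULL)
        return -1;

    int order;
    int children = 0;
    int failed = 0;

    while (fscanf(file, "%d", &order) == 1) {
        /* Sync the descriptor's offset with the stream position and drop
         * the read-ahead buffer before creating a second handle. */
        fflush(file);

        pid_t f = fork();
        if (f < 0) {
            failed = 1;
            break;
        }
        if (f == 0) {
            /* Child: a real program would process `order` here. */
            exit(0);
        }
        children++;
    }

    fclose(file);

    /* Reap every child so none is left as a zombie. */
    while (wait(NULL) > 0)
        ;

    return failed ? -1 : children;
}
```

With the sample input of six integers, the intent is that the loop runs six times and the helper returns 6. Flushing on every iteration costs an extra seek and re-read per number, which is negligible for a small input file.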

Jonathan Leffler
  • I think this pretty much answers the question. However, it would be nice to find the root of the problem, maybe help fixing it, preventing this weird effect in the first place. – Max Paython May 09 '18 at 04:37
  • 1
    I guess the simplest thing to do is to create a bug against the GNU C Library using your code and/or the code in the other Q&A. In theory, we should verify against the bug lists that this is a new bug, but it is tempting just to enter the bug anyway. The bug report could sensibly cross-reference both questions on SO, but should include the code of at least one of the programs that illustrates the problem. – Jonathan Leffler May 09 '18 at 04:45
  • 1
    Bug 23151 has been rejected as invalid. I've put the discussion and explanation of what that is about in the [other question](https://stackoverflow.com/questions/50110992/) and closed this as a duplicate of that. It's all a set of non-obvious consequences. In this code, you should probably do `fflush(file)` or `fflush(NULL)` before the `fork()`. – Jonathan Leffler May 09 '18 at 15:24
-1

The number of times each character in the text file is read equals the number of processes created. With n unconditional fork() calls in sequence, the total number of processes is 2^n. In the example below, n = 3, so 2^3 = 8.

Let us put some label names for the three lines:

fork ();   // Line 1
fork ();   // Line 2
fork ();   // Line 3

      L1       // There will be 1 child process 
   /     \     // created by line 1.
  L2      L2    // There will be 2 child processes
 /  \    /  \   //  created by line 2
L3  L3  L3  L3  // There will be 4 child processes 
            // created by line 3

Example:

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    fork();
    fork();
    fork();
    /* Runs once in each of the 8 processes. */
    printf("Gwapo ko\n");
    return 0;
}

Output:

Gwapo ko
Gwapo ko
Gwapo ko
Gwapo ko
Gwapo ko
Gwapo ko
Gwapo ko
Gwapo ko
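The count of eight can also be checked programmatically. This is a rough sketch with a hypothetical helper, `count_processes_after_forks`, which is not part of the example above: every process writes one byte to a shared pipe, and the original process counts the bytes until all write ends are closed.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: perform n unconditional fork() calls and return how
 * many processes exist afterwards (including the original), or -1 on error. */
int count_processes_after_forks(int n)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t original = getpid();

    /* Every process that exists at step i forks again, doubling the total.
     * A failed fork() simply lowers the final count. */
    for (int i = 0; i < n; i++)
        fork();

    /* Each process reports itself with exactly one byte. */
    int reported = (write(fds[1], "x", 1) == 1);
    close(fds[1]);

    if (getpid() != original)
        _exit(reported ? 0 : 1);   /* descendants stop here */

    /* Original process: count bytes until every write end is closed. */
    int count = 0;
    char c;
    while (read(fds[0], &c, 1) == 1)
        count++;
    close(fds[0]);

    /* Reap the direct children. */
    while (wait(NULL) > 0)
        ;

    return count;
}
```

Calling `count_processes_after_forks(3)` should yield 8, provided no fork() fails. The descendants use `_exit()` so that they do not flush any stdio buffers inherited from the caller.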

Here is another example:

#include <stdio.h>
#include <unistd.h>

void forkexample(void)
{
    // fork() returns zero in the child process
    if (fork() == 0)
        printf("Hello from Child!\n");

    // and a non-zero value in the parent (the child's PID, or -1 on failure)
    else
        printf("Hello from Parent!\n");
}

int main(void)
{
    forkexample();
    return 0;
}

Output:

1.
Hello from Child!
Hello from Parent!
     (or)
2.
Hello from Parent!
Hello from Child!

A child process is created; fork() returns 0 in the child process and a positive integer (the child's PID) in the parent process. Two outputs are possible because the parent and child run concurrently, so we don't know which of the two the OS will schedule first.

Important: The parent and child processes run the same program, but that does not mean they are identical. The OS allocates separate data and state for the two processes, and their control flow can differ.
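A small sketch of that separation, using a hypothetical helper named `parent_value_after_child_write`: the child modifies a local variable, and the parent's copy is unaffected.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the parent's copy of a variable after the
 * child has modified its own copy, or -1 if fork() fails. */
int parent_value_after_child_write(void)
{
    int value = 1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: this write changes only the child's copy of `value`. */
        value = 2;
        _exit(value == 2 ? 0 : 1);
    }

    /* Parent: wait for the child, then return the untouched copy. */
    waitpid(pid, NULL, 0);
    return value;
}
```

The parent's memory is private, so the helper returns 1. This is in contrast to the offset of an inherited open file, which the two processes do share.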

Theory: It might be that the process that exits is a child process and not the parent, leaving the parent process and the other child processes running.

Master James