Basic answer
Given that the data file is 73 bytes long (give or take; you might have extra white space around that I didn't guess at), the first call to fscanf() will read the whole file into memory. The parent process then reads 10 lines' worth from memory, moving the read pointer in the standard I/O buffer. The trailing newlines in the fscanf() format strings are not really needed; the %d skips leading white space, which includes newlines, and if the input were not coming from a file, the trailing white space in the format would be a very bad UX: the user would have to type the (start of the) next number to complete the current input. (See "scanf() leaves the newline in the buffer" and "What is the effect of trailing white space in a scanf() format string?".)
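To illustrate why the trailing newline in the format is unnecessary, here is a minimal sketch that parses newline-separated numbers with a plain "%d". It uses sscanf() on a string rather than fscanf() on a file so that it needs no input file; the function name parse_ints is my own invention for the illustration.

```c
#include <stdio.h>

/* Parse up to max integers from text using a plain "%d" conversion.
 * "%n" records how many characters each call consumed, so the next
 * call can resume from that point. Returns the number of integers read. */
int parse_ints(const char *text, int *out, int max)
{
    int count = 0;
    int used = 0;

    while (count < max && sscanf(text, "%d%n", &out[count], &used) == 1)
    {
        text += used;   /* the newline after the number is left unread */
        count++;
    }
    return count;
}
```

Each call leaves the newline behind, and the next "%d" skips it as leading white space, so nothing is gained by putting "\n" in the format.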
Then the process forks. The child is an exact copy of the parent, so it continues reading where the parent left off, and prints 10 numbers as you expected, and then exits.
The parent process then resumes. It has done nothing to change the position of the pointer in memory, so it continues where it left off. However, the reading code now reads single characters and prints their decimal values, so it gets 50, 57, 10, which are the character codes for '2', '9', and '\n'. And so the output continues for all the rest of the prime numbers in the input.
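A small sketch of that effect, assuming an ASCII-compatible character set: it writes some text to a temporary file and collects what fgetc() returns for each byte. The helper name collect_codes is hypothetical.

```c
#include <stdio.h>

/* Write text to a temporary file, rewind, and store the value that
 * fgetc() returns for each byte. Returns the number of codes stored,
 * or -1 if the temporary file could not be created. */
int collect_codes(const char *text, int *codes, int max)
{
    FILE *fp = tmpfile();
    int c;
    int count = 0;

    if (fp == NULL)
        return -1;
    fputs(text, fp);
    rewind(fp);
    while (count < max && (c = fgetc(fp)) != EOF)
        codes[count++] = c;     /* a character code, not a parsed number */
    fclose(fp);
    return count;
}
```

Calling it with the text "29\n" stores three codes, one per byte, rather than the single number 29 that fscanf() with "%d" would produce.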
You really need to fix the parent's final read loop so that it resumes using fscanf() instead of fgetc().
There isn't a sensible way for the parent to know what the child has done other than by changing from buffered I/O to unbuffered I/O. If you switched to unbuffered I/O, by calling setbuf(fichier, NULL); or setvbuf(fichier, NULL, _IONBF, 0); after opening the file but before doing any other operation on the file stream, then you would see that the parent process continues where the child left off.
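Here is a hedged sketch contrasting the two modes. The function parent_next_number, the temporary file, and its contents are all made up for the illustration. It assumes a POSIX system and an implementation where an unbuffered stream reads one byte at a time with one byte of pushback, which matches what I have observed but is not guaranteed by the C standard.

```c
#define _POSIX_C_SOURCE 200809L   /* mkstemp() and fdopen() in strict ISO C mode */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

/* The parent reads one number, a forked child reads two more from the
 * same stream, then the parent reads again. Returns the number the
 * parent gets on that second read, or -1 on error. */
int parent_next_number(int unbuffered)
{
    char path[] = "/tmp/forkbufXXXXXX";
    const char data[] = "2\n3\n5\n7\n11\n";
    int n = -1;
    int fd = mkstemp(path);
    FILE *fp;
    pid_t pid;

    if (fd < 0)
        return -1;
    unlink(path);                       /* the file disappears once it is closed */
    if (write(fd, data, strlen(data)) != (ssize_t)strlen(data) ||
        lseek(fd, 0, SEEK_SET) != 0 ||
        (fp = fdopen(fd, "r")) == NULL)
    {
        close(fd);
        return -1;
    }
    if (unbuffered)
        setvbuf(fp, NULL, _IONBF, 0);   /* before any other operation on fp */

    if (fscanf(fp, "%d", &n) != 1 || (pid = fork()) < 0)
    {
        fclose(fp);
        return -1;
    }
    if (pid == 0)
    {
        int i;
        for (i = 0; i < 2; i++)         /* the child reads the next two numbers */
        {
            if (fscanf(fp, "%d", &n) != 1)
                _exit(1);
        }
        _exit(0);                       /* skip stdio cleanup in the child */
    }
    waitpid(pid, NULL, 0);
    if (fscanf(fp, "%d", &n) != 1)      /* where does the parent resume? */
        n = -1;
    fclose(fp);
    return n;
}
```

With buffering, the whole file is already in the parent's buffer, so the parent should get 3 again; without buffering, the child's reads move the shared file offset and the parent should get 7.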
A side-note: I'm not convinced about the loop in create_process(): if there aren't enough resources, at least wait a little to give the system time to find some, but it is more common to treat 'out of resources' as a fatal error.
Another side-note: sending a signal to a process that's already died (because you waited for it to die) isn't going to work.
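A minimal sketch of why that fails, assuming a POSIX system. The function name signal_reaped_child is hypothetical, and it uses signal 0, which performs the same existence and permission checks as a real signal without delivering anything.

```c
#define _POSIX_C_SOURCE 200809L   /* kill() in strict ISO C mode */
#include <errno.h>
#include <signal.h>
#include <unistd.h>
#include <sys/wait.h>

/* Fork a child that exits at once, reap it, then try to signal it.
 * Returns the errno value left by kill(), 0 if kill() unexpectedly
 * succeeded, or -1 if fork() or waitpid() failed. */
int signal_reaped_child(void)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);
    if (waitpid(pid, NULL, 0) != pid)
        return -1;
    /* After the wait, the pid no longer names a process (barring pid reuse) */
    if (kill(pid, 0) == 0)
        return 0;
    return errno;
}
```

Once the child has been waited for, I would expect kill() to fail with ESRCH ("no such process").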
Here's some revised code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

/* Fork once; treat failure as a fatal error rather than retrying */
static pid_t create_process(void)
{
    pid_t pid = fork();
    if (pid < 0)
    {
        fprintf(stderr, "Failed to fork\n");
        exit(1);
    }
    return pid;
}

int main(void)
{
    const char filename[] = "entiers.txt";
    FILE *fichier = fopen(filename, "r");
    int i = 0;
    int n = 0;

    if (fichier == NULL)
    {
        fprintf(stderr, "Failed to open file '%s' for reading\n", filename);
        exit(1);
    }

    /* Uncomment one of these to switch to unbuffered input */
    // setbuf(fichier, NULL);
    // setvbuf(fichier, NULL, _IONBF, 0);

    /* Parent reads the first 10 numbers */
    printf("I am the parent process with the pid %d\n", (int)getpid());
    for (i = 0; i < 10; i++)
    {
        if (fscanf(fichier, "%d", &n) != 1)
            break;
        printf("%d\n", n);
    }

    /* Flush pending output so the child does not inherit and repeat it */
    fflush(stdout);
    pid_t pid = create_process();
    if (pid == 0)
    {
        /* Child reads the next 10 numbers from its copy of the stream */
        printf("I am the child process with the pid %d\n", (int)getpid());
        for (i = 0; i < 10; i++)
        {
            if (fscanf(fichier, "%d", &n) != 1)
                break;
            printf("%d\n", n);
        }
    }
    else
    {
        /* Parent waits for the child, then reads everything that is left */
        wait(NULL);
        printf("I am the parent process with the pid %d\n", (int)getpid());
        while (fscanf(fichier, "%d", &n) == 1)
            printf("%d\n", n);
    }
    fclose(fichier);
    return EXIT_SUCCESS;
}
Sample output:
I am the parent process with the pid 15704
2
3
5
7
11
13
17
19
23
29
I am the child process with the pid 15705
31
37
41
43
47
53
59
61
67
71
I am the parent process with the pid 15704
31
37
41
43
47
53
59
61
67
71
73
79
83
89
97
Very often, questions like this involve file descriptor I/O and the discussion has to cover the difference between an open file descriptor and an open file description and explain what's shared between processes and what isn't. Because the input file is so small, that isn't an issue with this code. If the table of primes went up to, say, 999983 (the largest prime smaller than a million), and the child process read much more data, then you'd see different effects altogether.
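For the record, the descriptor versus description distinction can be sketched with plain read() and lseek(), assuming a POSIX system. The function name offsets_after_child_read, the temporary file, and its contents are made up for the illustration.

```c
#define _POSIX_C_SOURCE 200809L   /* mkstemp() in strict ISO C mode */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

/* A child reads 3 bytes through a descriptor inherited across fork().
 * Afterwards, report the parent's offset on that descriptor and on a
 * second descriptor obtained from a separate open() of the same file.
 * Returns 0 on success, -1 on error. */
int offsets_after_child_read(long *inherited, long *reopened)
{
    char path[] = "/tmp/ofdXXXXXX";
    char buf[3];
    int fd = mkstemp(path);
    int fd2;
    pid_t pid;

    if (fd < 0)
        return -1;
    fd2 = open(path, O_RDONLY);     /* separate open(): separate description */
    unlink(path);
    if (fd2 < 0 || write(fd, "abcdef", 6) != 6 ||
        lseek(fd, 0, SEEK_SET) != 0 || (pid = fork()) < 0)
    {
        close(fd);
        if (fd2 >= 0)
            close(fd2);
        return -1;
    }
    if (pid == 0)                   /* the child reads via the inherited fd only */
        _exit(read(fd, buf, sizeof(buf)) == (ssize_t)sizeof(buf) ? 0 : 1);
    waitpid(pid, NULL, 0);
    *inherited = (long)lseek(fd, 0, SEEK_CUR);
    *reopened = (long)lseek(fd2, 0, SEEK_CUR);
    close(fd);
    close(fd2);
    return 0;
}
```

The descriptor inherited across fork() refers to the same open file description in both processes, so the parent sees the offset the child left behind; the descriptor from the separate open() has its own description and its offset is untouched.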
Unbuffered input and trailing newlines in scanf() format strings
Empirical observation shows that when the original version of the code shown above had fscanf(fichier, "%d\n", &n) in both the parent's first read loop and the child's read loop, and the program was configured to use unbuffered input, the output would look like:
…
67
71
I am the parent process with the pid 27071
33
79
…
where the 33 isn't expected at first glance. However, there is an explanation for what goes wrong.
There's at least one byte of pushback available on the stream (even with no buffering), so at the point where the parent forks, both parent and child have the 3 from 31 in the pushback position (because the newline was read as a white space character, and the first non-blank character, the 3 of the line containing 31, was read and pushed back into the input buffer).
The child is an almost exact copy of the parent. It reads the pushback character, continues with the 1, gets the newline and then the 3 of 37, and prints 31 as you'd expect. This continues until it reads the 7 at the start of 73 and pushes it back into its own input buffer, which of course has no effect on the parent's input buffer (they're separate processes). The child exits.
The parent resumes. It has a 3 in its pushback position, and then gets the 3 from 73 (because the parent and child share the same open file description, and the read position is associated with the description, not the descriptor, so the child has moved the read position). It then gets a newline and terminates its scanning (the last loop was missing the trailing white space in the scanf() format string anyway), and prints 33, which is correct given what it read. It then proceeds to read the rest of the input cleanly, skipping over white space (newline) before reading each number.
Changing the code to use fscanf(fichier, "%d", &n) throughout means that the child process stops with the newline before 73 in its pushback buffer, and the read position pointing at the 7 of 73, which is exactly where the parent needs it.
If the first parent loop had omitted the newline in the fscanf() format, then the child would still have worked, but the parent would have reported 3 as the first number when it resumed, instead of 33.
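The 33 effect can be reproduced in miniature. The sketch below is hypothetical scaffolding (the function names, the temporary file, and its five numbers are all invented), and it relies on the same empirical assumption as the explanation above: an unbuffered stream reads one byte at a time and keeps one byte of pushback.

```c
#define _POSIX_C_SOURCE 200809L   /* mkstemp() and fdopen() in strict ISO C mode */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

/* Read one number, with or without a trailing newline in the format */
static int read_number(FILE *fp, int trailing_newline, int *n)
{
    return trailing_newline ? fscanf(fp, "%d\n", n) : fscanf(fp, "%d", n);
}

/* Unbuffered stream: the parent reads 29, the child reads 31 and 37,
 * then the parent resumes with "%d". Returns what the parent reads
 * on resuming, or -1 on error. */
int parent_resumes_with(int trailing_newline)
{
    char path[] = "/tmp/pushbackXXXXXX";
    const char data[] = "29\n31\n37\n73\n79\n";
    int n = -1;
    int fd = mkstemp(path);
    FILE *fp;
    pid_t pid;

    if (fd < 0)
        return -1;
    unlink(path);
    if (write(fd, data, strlen(data)) != (ssize_t)strlen(data) ||
        lseek(fd, 0, SEEK_SET) != 0 ||
        (fp = fdopen(fd, "r")) == NULL)
    {
        close(fd);
        return -1;
    }
    setvbuf(fp, NULL, _IONBF, 0);       /* before any other operation on fp */

    if (read_number(fp, trailing_newline, &n) != 1 || (pid = fork()) < 0)
    {
        fclose(fp);
        return -1;
    }
    if (pid == 0)
    {
        int i;
        for (i = 0; i < 2; i++)
        {
            if (read_number(fp, trailing_newline, &n) != 1)
                _exit(1);
        }
        _exit(0);                       /* skip stdio cleanup in the child */
    }
    waitpid(pid, NULL, 0);
    if (fscanf(fp, "%d", &n) != 1)      /* the parent's stale pushback comes first */
        n = -1;
    fclose(fp);
    return n;
}
```

With the trailing newline in the format, the parent's stale 3 is followed by the 3 of 73 and I would expect 33; with plain "%d" throughout, the stale character is a harmless newline and the parent should read 73.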