I'm writing a multiprocess program for school in which each process reads a portion of a file, and together they count the number of words in the file. My problem is that with more than 2 processes, every process reads EOF before it has finished reading its portion of the file. Here's the relevant code:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>

int main(int argc, char *argv[]) {

    FILE *input_textfile = NULL;
    char input_word[1024];
    int num_processes = 0;
    int proc_num = 0; //The index of this process (used after forking)
    long file_size = -1;

    input_textfile = fopen(argv[1], "r");
    num_processes = atoi(argv[2]);

    //...Normally error checking would go here

    if (num_processes > 1) {

        //...create space for pipes

        for (proc_num = 0; proc_num < num_processes - 1; proc_num++) {

            //...create pipes

            pid_t proc = fork();

            if (proc == -1) {
                fprintf(stderr,"Could not fork process index %d", proc_num);
                perror("");
                return 1;
            } else if (proc == 0) {
                break;
            }

            //...link up the pipes
        }
    }

    //This code taken from http://stackoverflow.com/questions/238603/how-can-i-get-a-files-size-in-c
    //Interestingly, it also fixes a bug we had where the child would start reading at an unpredictable place
    //No idea why, but apparently the offset wasn't guaranteed to start at 0 for some reason
    fseek(input_textfile, 0L, SEEK_END);
    file_size = ftell(input_textfile);
    fseek(input_textfile, proc_num * (1.0 * file_size / num_processes), 0);

    //read all words from the file and add them to the linked list
    if (file_size != 0) {

        //Explanation of this mess of a while loop:
        //  if we're a child process (proc_num < num_processes - 1), then loop until we make it to where the next
        //  process would start (the ftell part)
        //  if we're the parent (proc_num == num_processes - 1), loop until we reach the end of the file
        while ((proc_num < num_processes - 1 && ftell(input_textfile) < (proc_num + 1) * (1.0 * file_size / num_processes))
                || (proc_num == num_processes - 1 && ftell(input_textfile) < file_size)){
            int res = fscanf(input_textfile, "%s", input_word);

            if (res == 1) {
                //count the word
            } else if (res == EOF && errno != 0) {
                perror("Error reading file: ");
                exit(1);
            } else if (res == EOF && ftell(input_textfile) < file_size) {
                printf("Process %d found unexpected EOF at %ld.\n", proc_num, ftell(input_textfile));
                exit(1);
            } else if (res == EOF && feof(input_textfile)){
                continue;
            } else {
                printf("Scanf returned unexpected value: %d\n", res);
                exit(1);
            }
        }
    }

    //don't get here anyway, so no point in closing files and whatnot

    return 0;
}

Output when running the file with 3 processes:

All files opened successfully
Process 2 found unexpected EOF at 1323008.
Process 1 found unexpected EOF at 823849.
Process 0 found unexpected EOF at 331776.

The test file that causes the error: https://dl.dropboxusercontent.com/u/16835571/test34.txt

Compile with:

gcc main.c -o wordc-mp

and run as:

wordc-mp test34.txt 3

It's worth noting that only that particular file gives me issues, but the offsets of the error keep changing so it's not the contents of the file.

Evan Allan
  • Jonathan's guess may be correct, but you should always post a [Minimal Complete Verifiable Example](http://stackoverflow.com/help/mcve) when asking for debugging help. – user3386109 Mar 17 '16 at 19:08
  • Alright I'll work on getting that done – Evan Allan Mar 17 '16 at 19:10
  • @user3386109 Done. Link is in the edit text. – Evan Allan Mar 17 '16 at 19:35
  • I don't like being the bearer of bad news, but... You need to [edit] the question, and place all of the *necessary* information in the question itself. So I would remove the code that you currently have, and replace it with the MCVE code. As for the text file, it's probably *not necessary* to put that in the question. I assume that any old text file (containing words) will do. So leaving a link to the text file is OK. – user3386109 Mar 17 '16 at 19:37
  • @user3386109 Ah ok sorry about that. First question and all. Anyway, I put the code into the question. As for the text file, it works fine on most of them, but it has issue with this particular one, possibly because it's much larger than the others, but honestly I'm not sure why. Anyway, point is, it is essential to testing, but I'm going to leave the link because it's too long to put in the question itself. – Evan Allan Mar 17 '16 at 19:46
  • Ok, the question is in good form now. A quick look at the code, and I still think that Jonathan is right. I'll try to reproduce that problem, but I'm using OS X so we'll see if that makes any difference. The one difference that I'm aware of is that OS X will start running the children sooner than linux would, so I should be able to recreate the problem with smaller input files. That's my theory as to why only the large file causes problems: on linux each child will finish before the next one starts (for small files). – user3386109 Mar 17 '16 at 19:58
  • It would seem that even though we did the test in class, the issue was in fact that the file was being opened before the fork. I guess we got lucky during the test since the file was so small. Anyway, moving opening the file to later in the program was enough to fix my issue (although the code still doesn't work, but for other reasons now). Thank you guys for the help. – Evan Allan Mar 17 '16 at 20:14

1 Answer

You opened the file before forking. Each child process inherits a file descriptor that refers to the same open file description as the parent's, so all of the processes share a single file offset: whenever one of them reads or seeks, the offset moves for every other process as well.

The fork(2) man page confirms this:

  • The child process is created with a single thread—the one that called fork(). The entire virtual address space of the parent is replicated in the child, including the states of mutexes, condition variables, and other pthreads objects; the use of pthread_atfork(3) may be helpful for dealing with problems that this can cause.

  • The child inherits copies of the parent's set of open file descriptors. Each file descriptor in the child refers to the same open file description (see open(2)) as the corresponding file descriptor in the parent. This means that the two descriptors share open file status flags, current file offset, and signal-driven I/O attributes (see the description of F_SETOWN and F_SETSIG in fcntl(2)).
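
The shared offset described in the quote can be observed directly. The following is a minimal sketch, not code from the question: the function name `offset_after_child_read` and the `MAX_READ` limit are made up for illustration, and it uses a raw descriptor rather than a `FILE *` so that stdio buffering does not obscure the effect. The parent opens a file, forks, lets the child read some bytes, and then asks for its own offset.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { MAX_READ = 256 }; /* illustrative cap on how much the child reads */

/* Returns the PARENT's file offset after a forked child has read `nbytes`
 * bytes through the inherited descriptor, or -1 on error.
 * `path` must name a readable regular file of at least `nbytes` bytes. */
long offset_after_child_read(const char *path, size_t nbytes)
{
    assert(nbytes <= MAX_READ);

    int fd = open(path, O_RDONLY); /* opened BEFORE fork, as in the question */
    if (fd == -1) {
        perror("open");
        return -1;
    }

    pid_t pid = fork();
    if (pid == -1) {
        perror("fork");
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: read through the inherited descriptor, then leave with
         * _exit so no stdio buffers copied from the parent get flushed. */
        char buf[MAX_READ];
        ssize_t got = read(fd, buf, nbytes);
        _exit(got == (ssize_t)nbytes ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) == -1 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }

    /* The parent has not read anything itself, yet its offset has moved. */
    off_t pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return (long)pos;
}
```

For a file of at least 10 bytes, `offset_after_child_read(path, 10)` should return 10 rather than 0, because the child's read advanced the one offset both processes share. With `FILE *` streams the picture is murkier, since each process also has its own user-space buffer on top of the shared offset, which may be why the offsets in the question look unpredictable.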

Jonathan Schoreels
  • I have done that, but we tested in class and I also tested myself that advancing in either the child or parent does not advance in any of the others. Plus, you can see what position each process was at in the file when it got to EOF, and none of them are the actual end of the file. – Evan Allan Mar 17 '16 at 19:08
  • Actually went ahead and tried moving when I opened the file and it fixed it. Not sure why it worked in the test in class, but anyway, thank you for the answer! And sorry for writing it off initially. – Evan Allan Mar 17 '16 at 20:15
  • As you can see, this is documented in the man page of fork. While forking preserves virtual memory (a whole copy is made), file descriptors are tied to state kept by the system: the file descriptor (which is a pointer to a file description) is copied, but the copy refers to the same description (which is not copied). – Jonathan Schoreels Mar 17 '16 at 20:37
  • Yeah sorry I saw that there and that's what I originally thought, but he showed the example in class which made me think otherwise. – Evan Allan Mar 17 '16 at 21:10
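
The fix mentioned in the comments above (opening the file after the fork, so each process gets its own open file description) could look roughly like the sketch below. It is not the asker's final code: the function names `count_words_in_range` and `count_words_parallel` are hypothetical, every range is handled by a child (the question's program also uses the parent as a worker), results come back over a single pipe, and a word is attributed to the range in which it starts so that words straddling a boundary are counted once.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Count the words that START in the byte range [start, end).
 * The stream is opened here, so its offset belongs to this process alone. */
long count_words_in_range(const char *path, long start, long end)
{
    assert(start >= 0 && start <= end);
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    int in_word = 0;
    if (start > 0) {
        /* Peek at the byte before `start`: if it is part of a word, that
         * word started in the previous range and is counted there. */
        if (fseek(fp, start - 1, SEEK_SET) != 0) {
            fclose(fp);
            return -1;
        }
        int prev = fgetc(fp);
        in_word = (prev != EOF && !isspace(prev));
    }

    long count = 0;
    int c;
    for (long pos = start; pos < end && (c = fgetc(fp)) != EOF; pos++) {
        if (isspace(c)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            count++;
        }
    }
    fclose(fp);
    return count;
}

/* Split the file into nprocs byte ranges, count each range in its own child,
 * and sum the results sent back over one pipe. Returns -1 on error. */
long count_words_parallel(const char *path, int nprocs)
{
    if (nprocs < 1)
        return -1;
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    long size = (fseek(fp, 0L, SEEK_END) == 0) ? ftell(fp) : -1;
    fclose(fp); /* closed before any fork, so no stream is inherited */
    if (size < 0)
        return -1;

    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    int spawned = 0;
    for (int i = 0; i < nprocs; i++) {
        long start = size * i / nprocs;
        long end = size * (i + 1) / nprocs;
        pid_t pid = fork();
        if (pid == -1)
            break;
        if (pid == 0) {
            close(fds[0]);
            long n = count_words_in_range(path, start, end);
            ssize_t w = write(fds[1], &n, sizeof n);
            _exit(w == (ssize_t)sizeof n ? 0 : 1);
        }
        spawned++;
    }
    close(fds[1]);

    long total = 0;
    int ok = (spawned == nprocs);
    for (int i = 0; i < spawned; i++) {
        long n;
        if (read(fds[0], &n, sizeof n) != (ssize_t)sizeof n || n < 0)
            ok = 0;
        else
            total += n;
    }
    close(fds[0]);
    for (int i = 0; i < spawned; i++)
        wait(NULL); /* reap every child that was started */
    return ok ? total : -1;
}
```

A call such as `count_words_parallel("test34.txt", 3)` should give the same total as a single-process count. Each child calls `fopen` itself, so no process can move another's position. The integer arithmetic for the range boundaries also avoids the floating-point offsets used in the question.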