
I'm a little desperate because I don't know how to write a program that reads only some of the data (words or numbers) from an input file and then writes that data to another file in tabulated form. For example, I don't know how to make the program find the line "number of sequences: 2" in the input file and store only the value "2" in the new file. Please help me, I'm just starting out.

Thank you all

  • Does this answer your question? [Read text file in C](https://stackoverflow.com/questions/49933210/read-text-file-in-c) – Adam Jun 21 '20 at 14:32
  • Thank you, it could be a solution... but I don't know how to use strtok –  Jun 21 '20 at 14:44
  • Looking at "number of sequences: 2" you only want to read the "2". How do you know? How do you as a human decide that only that is the relevant data? What makes you recognise that? You will have to find out about your own thinking and then explain it to us. Is it the 22nd character in the line? Is it the last character in the line? If you cannot answer that question then it will not be possible to make a computer do that, because computers are much more stupid than humans, just much faster. – Yunnosch Jun 21 '20 at 14:45
  • yes, it's exactly the 22nd character in the line –  Jun 21 '20 at 14:48

1 Answer


The issue you are having is not with the loop, and not with the eof.

The real issue is you have incorrect parsing logic.

Your input file is not uniform:

  • Different session lines have different "MODE" in them
  • Number of blank lines varies from group to group
  • Blank lines may actually contain any number of space characters
  • "Number of sequences" line appears in different places in different groups

To parse such a file you need more flexible logic that checks each input line, collects all the data needed to build an output line, and only then prints it to the output file.

To do this, you can use a single loop that reads one line at a time and tests its contents with the strncmp function.
Once you have identified the type of data the line contains, save it to a variable using the sscanf function.
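As a small illustration of these two steps on a single line, here is a minimal sketch. The helper name `parse_sequences` is hypothetical and only used for this example; it assumes the line looks like "number of sequences: 2", as in the question.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 and stores the number in *out if the line
   starts with "number of sequences" and a number follows the colon;
   returns 0 otherwise. */
int parse_sequences(const char *line, int *out) {
    static const char prefix[] = "number of sequences";

    assert(line != NULL && out != NULL);

    /* Step 1: identify the line by comparing only its first characters */
    if (strncmp(line, prefix, strlen(prefix)) != 0) {
        return 0;
    }

    /* Step 2: extract the value; sscanf returns the number of items stored */
    return sscanf(line, "number of sequences: %d", out) == 1;
}
```

Calling `parse_sequences(line, &n)` on each line read by fgets would leave `n` untouched for lines of any other type, so the caller can try the next line type instead.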

Here is the code that will do the job:

#include <stdio.h>
#include <string.h>

int main(int argc, char **argv) {
    FILE *file_in, *file_out;
    char line[200];

    /* initialize these just in case we want to validate the input file */
    int current_session = 0;
    int current_sequences = 0;
    int current_registration = 0;

    /* these arrays can probably be smaller */
    char chars_given[200] = { 0 };
    char chars_recognized[200] = { 0 };

    file_in = fopen("summary.txt", "r");
    if (file_in == NULL) {
        perror("Error opening input file");
        return 1;
    }

    file_out = fopen("ordinated.txt", "w");
    if (file_out == NULL) {
        perror("Error opening output file");
        fclose(file_in); /* do not leak the input file on failure */
        return 1;
    }

    while (fgets(line, sizeof line, file_in) != NULL) {
        /* check if this is start of session using safe string comparison */
        if (strncmp(line, "session", strlen("session")) == 0) {
            sscanf(line, "session %d", &current_session);
        } else if (strncmp(line, "number of sequences", strlen("number of sequences")) == 0) {
            sscanf(line, "number of sequences: %d", &current_sequences);
        } else if (strncmp(line, "registration", strlen("registration")) == 0) {
            sscanf(line, "registration %d", &current_registration);
        } else if (strncmp(line, "characters given", strlen("characters given")) == 0) {
            /* %199s limits the copy so it cannot overflow the 200-byte array */
            sscanf(line, "characters given: %199s", chars_given);
        } else if (strncmp(line, "characters recognized", strlen("characters recognized")) == 0) {
            sscanf(line, "characters recognized: %199s", chars_recognized);
        } else {
            /* This is a line with no information (blank or separator).
               Time to print results we collected, and reset the variables
               for the next set of results. */
        
            /* check we have enough information to output a line */
            if (current_session > 0 && current_sequences > 0 &&
                current_registration > 0 && strlen(chars_given) > 0) {
            
                /* check if anything was recognized */
                if (strlen(chars_recognized) > 0) {
                    fprintf(file_out, "%d %d %d %s %s\n", current_session, current_registration,
                        current_sequences, chars_given, chars_recognized);
                } else { /* one less parameter to output if nothing was recognized */
                    fprintf(file_out, "%d %d %d %s\n", current_session, current_registration,
                        current_sequences, chars_given);
                }
            
                /* Now reset for next time. If you don't do this, the output line will repeat */
                current_registration = 0;
                chars_given[0] = '\0';
                chars_recognized[0] = '\0';
             }
        }
    }

    /* the last block may not be printed in the loop if there is no empty line after it */
    if (current_session > 0 && current_sequences > 0 &&
        current_registration > 0 && strlen(chars_given) > 0) {
            
        /* check if anything was recognized */
        if (strlen(chars_recognized) > 0) {
            fprintf(file_out, "%d %d %d %s %s\n", current_session, current_registration,
                current_sequences, chars_given, chars_recognized);
        } else { /* one less parameter to output if nothing was recognized */
            fprintf(file_out, "%d %d %d %s\n", current_session, current_registration,
                current_sequences, chars_given);
        }
    }

    fclose(file_in);
    fclose(file_out);

    return 0;
}

This code is a bit ugly, but I tried to keep it simple.

It can be cleaned up by using structures, some flags, and moving some of the code to separate functions.
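A rough sketch of what that cleanup might look like is shown here. The type `struct record` and the function names are hypothetical and not part of the program above; the fields and the output order simply mirror its variables and fprintf calls.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record type: the data needed for one output line */
struct record {
    int session;
    int sequences;
    int registration;
    char given[200];
    char recognized[200];
};

/* A record can be printed once every mandatory field has been seen */
int record_is_complete(const struct record *r) {
    assert(r != NULL);
    return r->session > 0 && r->sequences > 0 &&
           r->registration > 0 && r->given[0] != '\0';
}

/* Write one record; the last column is omitted if nothing was recognized.
   Returns whatever fprintf returns. */
int record_print(FILE *out, const struct record *r) {
    assert(out != NULL && r != NULL);
    if (r->recognized[0] != '\0') {
        return fprintf(out, "%d %d %d %s %s\n", r->session, r->registration,
                       r->sequences, r->given, r->recognized);
    }
    return fprintf(out, "%d %d %d %s\n", r->session, r->registration,
                   r->sequences, r->given);
}

/* Clear only the per-registration fields; session and sequences carry over */
void record_reset(struct record *r) {
    assert(r != NULL);
    r->registration = 0;
    r->given[0] = '\0';
    r->recognized[0] = '\0';
}
```

With these helpers, both the else branch inside the loop and the block after the loop could shrink to a check of `record_is_complete` followed by `record_print` and `record_reset`, which removes the duplicated printing code.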

Edit: this code omits sanity checks for simplicity and assumes the input file is not corrupt, i.e. the first non-empty line is always a session line, every line contains all the information it should, etc.
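If you do want some of those checks, one possible approach is to test the value returned by sscanf before trusting the variable. The sketch below is only an assumption about how that could be done; `read_session`, `read_given`, and `FIELD_SIZE` are hypothetical names, and the line formats are taken from the sscanf calls above.

```c
#include <assert.h>
#include <stdio.h>

#define FIELD_SIZE 200 /* same size as the character arrays in the program */

/* Hypothetical helper: accept the session number only if sscanf actually
   stored one and it is positive. *session is left unchanged on failure. */
int read_session(const char *line, int *session) {
    int value;

    assert(line != NULL && session != NULL);
    if (sscanf(line, "session %d", &value) != 1 || value <= 0) {
        return 0;
    }
    *session = value;
    return 1;
}

/* Hypothetical helper: %199s stops after FIELD_SIZE - 1 characters, so an
   overlong word cannot overflow dst, which must hold FIELD_SIZE bytes. */
int read_given(const char *line, char *dst) {
    assert(line != NULL && dst != NULL);
    return sscanf(line, "characters given: %199s", dst) == 1;
}
```

Because sscanf reports a failed match through its return value, these helpers can replace both the strncmp test and the unchecked sscanf call for their line type.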

Lev M.
  • `if (strncmp(line, "session", strlen("session")) == 0) { sscanf(line, "session %d", &current_session);` has a weakness, as later code uses `current_session` without knowing whether `sscanf()` succeeded. There are similar weaknesses in other parts of the code. Consider `if (sscanf(line, "session %d", &current_session) == 1) { ; }` or the like. An improvement over the first answer. – chux - Reinstate Monica Jun 24 '20 at 23:06
  • @chux-ReinstateMonica you are correct, I did not put in any input validation, because the OP was complaining about the complexity of other solutions in a second question about this same problem (it got wrongly closed as a duplicate). I added a warning about this to the answer. – Lev M. Jun 24 '20 at 23:13