I am writing C code to read floating-point numbers from a text file. The text file looks like this:

202.75 47.22 141.20
202.75 47.22 -303.96
202.75 47.22 -301.67
202.75 47.22 482.42
...

There are 19973 lines in this file, and the C code snippet that reads it is:

nLinesToRead = 19973;
x  = (float *)calloc(nLinesToRead, sizeof(float));
y = (float *)calloc(nLinesToRead, sizeof(float));
z  = (float *)calloc(nLinesToRead, sizeof(float));
ptr = fopen(fileName, "r");
for(i=0; i<nLinesToRead; i++)   {
    status = fscanf(ptr, "%f%f%f", &x[i], &y[i], &z[i]);
    if(feof(ptr) && i<nLinesToRead)
        printf("\nERROR: encountered unexpected EOF in line %d", i+1);
    if(status != 3) {
        printf("\nERROR: Error reading valid pixels from the disk at line %d with status %d and %d\n", i+1, status, feof(ptr));
        printf("\nEOF is %d\n", EOF);
        exit(FAILURE);
    }
}

The output of this code is

ERROR: encountered unexpected EOF in line 19940
ERROR: encountered unexpected EOF in line 19941
ERROR: Error reading valid pixels from the disk at line 19941 with status -1 and 0

indicating that fscanf is encountering EOF at an unexpected location. Looking at lines 19939 through 19942

202.21 47.23 -453.42
202.21 47.23 -445.81
202.21 47.23 -419.89
202.21 47.41 179.25

I don't see anything weird there.

Has anyone encountered this before?

  • This is kind of a dumb question, but are you sure there are that many lines? – bentank May 06 '15 at 21:58
  • Yes... `wc -l` gives 19973 – user1670806 May 06 '15 at 21:59
  • Do you see this happening every time at the same location? Also, post your input file somewhere I can grab it if you don't mind. – bentank May 06 '15 at 22:00
  • Did you `#include <stdlib.h>`? – R Sahu May 06 '15 at 22:01
  • unrelated, what possible condition can you think of that would cause the second half of `if(feof(ptr) && i<nLinesToRead)` … – WhozCraig May 06 '15 at 22:03
  • While use of `if(feof(ptr) && i<nLinesToRead)` … – R Sahu May 06 '15 at 22:03
  • Is your input file terminated by `CRLF` somewhere within those lines? Try converting it to pure `LF` and see if it makes any difference. – alvits May 06 '15 at 22:05
  • [don't cast the return of calloc or malloc](http://makecleanandmake.com/2014/07/26/how-to-malloc-the-right-way/) – Ryan Haining May 06 '15 at 22:08
  • @R Sahu: Yes, `stdlib.h` is included. @alvits: The input file was created with a simple Python script that uses a print statement inside a for loop, so I would not expect it to have different line endings... – user1670806 May 06 '15 at 22:11
  • `x = calloc(nLinesToRead, sizeof *x);` is all you need. (see Ryan's comment above) @user1670806 - it sounds like a problem with your input file. Are there any numbers missing whitespace between them? – David C. Rankin May 06 '15 at 22:13
  • Running it here. It works fine. There is a small logic error as you should be checking `if(feof(ptr) && i != nLinesToRead -1)`. But I am able to read all through until real EOF. – bentank May 06 '15 at 22:19
  • Check your data for fields that have run together. Or read line-by-line with `fgets()` and then use `sscanf()` on each line. – Andrew Henle May 06 '15 at 22:23
  • @bentank: I am compiling the code with gcc 4.8.2 on Ubuntu 14.04. I'll give it a try on a different machine. – user1670806 May 06 '15 at 22:24
  • @user1670806 Confirmed, datafile is fine, `19973` lines read with 3 floats per line. Why are you reading with a `for` loop? A `while` loop with `fgets` or `getline` will allow you to read an indeterminate number of lines. – David C. Rankin May 06 '15 at 22:29
  • Can you try reading `errno` in your handler for the fail case? It is not standard for it to get set but some implementations will do so. – bentank May 06 '15 at 22:31
  • Out of curiosity, what happens when you [turn this loose](http://pastebin.com/UchKm41A) on the same file, renamed to `pixels.txt` (obviously)? – WhozCraig May 06 '15 at 23:16
  • Use the value returned by `scanf` to decide whether the loop should continue. Use `feof()` only to determine afterward whether it failed due to an end-of-file condition or an error. – Keith Thompson May 07 '15 at 00:06
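
To illustrate the last comment, here is a minimal sketch of a read loop driven by the return value of `fscanf`, with `feof()` and `ferror()` consulted only after the loop has stopped. The helper names `read_triples` and `stop_reason` are hypothetical and are not part of the code in the question; the sketch assumes the same three-floats-per-line layout.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: reads up to max "x y z" triples from fp.
 * The loop continues only while fscanf converts all three fields;
 * feof() is never used as the loop condition. */
static size_t read_triples(FILE *fp, float *x, float *y, float *z, size_t max)
{
    size_t n = 0;
    while (n < max && fscanf(fp, "%f%f%f", &x[n], &y[n], &z[n]) == 3)
        n++;
    return n;
}

/* Hypothetical helper: call only when read_triples() returned fewer
 * than max items, to find out why the loop stopped early. */
static const char *stop_reason(FILE *fp)
{
    if (ferror(fp))
        return "read error";
    if (feof(fp))
        return "end of file";
    return "malformed input";
}
```

With this split, a short count from `read_triples` says that something went wrong, and `stop_reason` says what: the stream ran out, the read failed, or a field could not be converted to a float.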

1 Answer

While you are finding another computer, let me suggest another way to read the file instead of hardwiring a for loop to read a fixed number of lines. When reading lines of data, you are generally better off using a line-oriented input function such as fgets or getline to read an entire line into a buffer, and then parsing the buffer to extract the individual items you need. That way, rather than trying to shoehorn your data into an fscanf read, you consume the entire line every time, and any failure is limited to parsing the buffer rather than reading the file.

Below is a quick bit of code that does what you appear to be attempting. My only other suggestion is that when reading three associated pieces of data, you are probably better off creating a struct that contains the three members rather than three separate arrays. That will simplify passing the data to functions, etc. Either way works, so it is up to you. Let me know if you have questions:

#include <stdio.h>
#include <stdlib.h>

#define MAXL 48
#define MAXF 20000

int main (int argc, char **argv) {

    if (argc < 2 ) {
        fprintf (stderr, "error: insufficient input, usage: %s filename\n", argv[0]);
        return 1;
    }

    FILE *fp = NULL;
    float *x = NULL;
    float *y = NULL;
    float *z = NULL;
    size_t idx = 0;
    char ln[MAXL] = {0};

    /* open file with filename given on command line */
    if (!(fp = fopen (argv[1], "r"))) {
        fprintf (stderr, "error: file open failed '%s'.\n", argv[1]);
        return 1;
    }

    /* allocate memory for arrays x, y, z (consider a struct) */
    if (!(x = calloc (MAXF, sizeof *x))) {
        fprintf (stderr, "error: virtual memory exhausted\n");
        return 1;
    }

    if (!(y = calloc (MAXF, sizeof *y))) {
        fprintf (stderr, "error: virtual memory exhausted\n");
        return 1;
    }

    if (!(z = calloc (MAXF, sizeof *z))) {
        fprintf (stderr, "error: virtual memory exhausted\n");
        return 1;
    }

    /* read each LINE in file and parse with sscanf for 3 floats */
    while (fgets (ln, MAXL, fp) != NULL)
    {
        if (sscanf (ln, "%f%f%f", &x[idx], &y[idx], &z[idx]) == 3) {
            idx++;
            if (idx == MAXF) {
                fprintf (stderr, "warning: %d lines read.\n", MAXF);
                break;
            }
        }
        else
            printf ("error: line %zu, failed to read 3 floats.\n", idx);
    }

    printf ("\n read '%zu' lines.\n\n", idx);

    size_t i = 0;
    for (i = 19938; i < 19942; i++)
        printf (" line[%zu] : %.2f %.2f %.2f\n", i + 1, x[i], y[i], z[i]);
    printf ("\n");

    free (x);
    free (y);
    free (z);

    fclose (fp);

    return 0;
}

Output

$ ./bin/fgets_sscanf_floats_dyn dat/testFile.txt

 read '19973' lines.

 line[19939] : 202.21 47.23 -453.42
 line[19940] : 202.21 47.23 -445.81
 line[19941] : 202.21 47.23 -419.89
 line[19942] : 202.21 47.41 179.25
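
For the struct suggestion, a minimal sketch might look like the following. The names `point`, `read_points`, and `POINT_LINE_LEN` are hypothetical and chosen only for illustration. It keeps the same fgets/sscanf approach and assumes no line is longer than the buffer; lines that do not parse as three floats are simply skipped.

```c
#include <stdio.h>
#include <stddef.h>

#define POINT_LINE_LEN 128   /* assumed upper bound on line length */

/* Hypothetical struct grouping the three associated values. */
typedef struct {
    float x, y, z;
} point;

/* Hypothetical helper: reads up to max points from fp, one per line.
 * Returns the number of points stored in pts. */
static size_t read_points(FILE *fp, point *pts, size_t max)
{
    char line[POINT_LINE_LEN];
    size_t n = 0;

    while (n < max && fgets(line, sizeof line, fp) != NULL) {
        point p;
        /* store the point only if all three fields converted */
        if (sscanf(line, "%f%f%f", &p.x, &p.y, &p.z) == 3)
            pts[n++] = p;
    }
    return n;
}
```

A single `point *` plus a count can then be passed to other functions instead of three parallel arrays.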
David C. Rankin