Stopping a while loop when using fscanf?

Question

I am running into this issue: I have a .txt file with space-delimited datapoints, and I am using fscanf to read those datapoints and store them in an array of floats. The trouble is that I am also trying to deal with bad inputs (for example, 12.3 is valid, but a.2, e, e.f, etc. are invalid), so I have set up the following:

float data[20];
int count = 0;
FILE *fp;
fp = fopen(argv[1], "r");
while ((count < 20)) {
    if fscanf(fp, "%f", &data[count]);
    else {
        data[count] = -1000;
    }
}

Currently, I have it hard-coded to end at 20 loops. But I need it to end at the end of the data in the file, which has 11 values. Presently it loads those values properly, but continues looping. I tried an EOF implementation, but I ran into issues with the bad data handling.

This code will not compile. Please copy and paste your actual code, as well as the error message. You should also describe what is the expected behavior. — SGeorgiades, Mar 04 '22 at 18:31
`while(count < 20 && fscanf(fp, "%f", &data[count]) == 1) { count++; }`. But if you are trying to deal with bad data, move away from `fscanf` to using `fgets` and `sscanf`. Or if they are space-separated on one line, use `%s` and then extract the `float` from a string. — Weather Vane, Mar 04 '22 at 18:31
@WeatherVane Okay, I will switch to fgets/sscanf. Thank you! — smallvt, Mar 04 '22 at 18:34
I edited that comment after seeing there isn't one value per line. You can read as space-delimited strings. — Weather Vane, Mar 04 '22 at 18:34
Do you just want it to stop after 11 data points, or do you want it to continue until EOF is reached? If the latter, what do you want to happen if there are more datapoints than the size of your array? What do you want to happen when you encounter an invalid value? — SGeorgiades, Mar 04 '22 at 18:34
@SGeorgiades sorry- I had forgotten "int count = 0;". But, I was more just curious about the specific workings of EOF/fscanf, and figured my quick snippet would show enough to contextualize the question. — smallvt, Mar 04 '22 at 18:35
Aside: never use the implicit `true` test from `scanf` functions. Test for the specific number of conversions needed, `1` in this case. — Weather Vane, Mar 04 '22 at 18:37
@SGeorgiades WeatherVane has pointed out for me that with fscanf / bad data handling I should switch to fgets/sscanf. Thank you for your help though, but I believe my problem for this question is resolved — smallvt, Mar 04 '22 at 18:37
You'll need to be more rigorous too: Converting say `1.a` with the simple application of `scanf` functions alone won't reveal an error. — Weather Vane, Mar 04 '22 at 18:39
Yup, I have things in the works for that later. I have already written this program in c++ and ran into a similar issue, but some of the conversions in the code have been a little weird. Thanks again — smallvt, Mar 04 '22 at 18:44
I'd use `fgets` but loop on `strtod` and check the trailing char (it should be only space or newline). It's faster and does more error checking. — Craig Estey, Mar 04 '22 at 20:09
@smallvt "deal with bad inputs " --> OK, once bad input is detected, what do you want the code to do? Quit? — chux - Reinstate Monica, Mar 04 '22 at 21:57
@smallvt "but I believe my problem for this question is resolved" --> Either then post your own answer or delete the question. — chux - Reinstate Monica, Mar 04 '22 at 21:58
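
The `fgets` plus `strtod` idea from the comments could look roughly like the sketch below. It is only an illustration, not code from the thread: `parse_token` and `read_tokens` are made-up names, `strtof` stands in for `strtod` because the question stores floats, the -1000 sentinel is taken from the question, and lines are assumed to fit in 255 characters (a longer line would be split mid-token).

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert one whitespace-free token to a float.
 * Returns 1 only if the entire token is a number, 0 otherwise. */
int parse_token(const char *tok, float *out)
{
    char *end;
    errno = 0;
    float v = strtof(tok, &end);
    /* end == tok: nothing converted; *end != '\0': trailing junk ("1.a") */
    if (end == tok || *end != '\0' || errno == ERANGE)
        return 0;
    *out = v;
    return 1;
}

/* Hypothetical helper: read tokens until end of file or until the array
 * is full. Invalid tokens are stored as -1000, as in the question.
 * Returns the number of array slots filled. */
size_t read_tokens(FILE *fp, float *data, size_t max)
{
    char line[256]; /* assumption: no input line is longer than this */
    size_t count = 0;

    /* fgets returns NULL at end of file, which is what stops the loop */
    while (count < max && fgets(line, sizeof line, fp) != NULL) {
        for (char *tok = strtok(line, " \t\r\n");
             tok != NULL && count < max;
             tok = strtok(NULL, " \t\r\n")) {
            if (!parse_token(tok, &data[count]))
                data[count] = -1000.0f;
            count++;
        }
    }
    return count;
}
```

With the question's variables this would be called as `count = read_tokens(fp, data, 20);`. Note that `strtof` also accepts forms such as `1e5`, `inf` and `nan`, so those would need an extra check if they should count as bad input.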

Neil · Answer 1 · 2022-03-05T03:11:26.297

I find reading the fscanf documentation always helps. With a single conversion, as in your case, it returns one of three values. A well-formed programme has to account for all three.

return	meaning
1	The parser was able to interpret and store the one value given, within the data type's limits.
0	Matching failure: the next bytes do not make sense given the format string. They stay in the stream, so retrying the same conversion fails again.
`EOF`	Input failure before any conversion: the end of the file was reached, or a read error occurred.
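
To see the three return values without a file, one can feed strings to `sscanf`, which follows the same rules. This is just a small sketch; `scan_one` is a made-up name.

```c
#include <stdio.h>

/* Hypothetical helper: attempt a single "%lf" conversion on a string and
 * hand back sscanf's return value unchanged (1, 0, or EOF). */
int scan_one(const char *text, double *out)
{
    return sscanf(text, "%lf", out);
}
```

Calling it with `"12.3"` gives 1, with `"a.2"` gives 0, and with an empty or all-whitespace string gives `EOF`. Note that `"1.a"` also gives 1 (the value 1.0 is stored and `a` is left unread), which is why the return value alone does not catch every bad token.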

To avoid inventing a separate error-reporting scheme, perhaps reuse `errno` from `<errno.h>`?
I like to put related data in a structure, `struct { size_t count; double data[20]; } join;`. One can then query the capacity with `const size_t join_max = sizeof join.data / sizeof *join.data;`.
Be careful to test the edge case where there are exactly 20 floats.
`fopen` is an added complexity that shouldn't have to be there in simple command-line programmes; the shell can usually do this more nicely with `./a.out < data`.
See this discussion on float vs double.

In your case, reading one double might look like:

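double lf;
switch(scanf("%lf", &lf)) {
case EOF: if(feof(stdin)) goto eof; else goto catch; /* end of file vs. read error */
case 0: errno = EILSEQ; goto catch; /* input is not a number */
case 1:
    /* store the value only while there is room left in the array */
    if(join.count < join_max) join.data[join.count++] = lf;
    else { errno = ERANGE; goto catch; }
    break;
}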

--

"What's up with all the gotos?"

One at least has to account for these conditions:

The result was added, try again.
The file is finished, success.
Read error, syntax error, or there is no space to hold more in the array; (depending on the application, I think these can be grouped together.)

It is easier if one has a standard out-of-band mechanism like exceptions. I can tell you how my code went, but this is just a quickly made-up example, and not necessarily good design:

{
    int success = EXIT_FAILURE;
    (loop)
        (scanf)
eof:
    (print)
    { success = EXIT_SUCCESS; goto finally; }
catch:
    sprintf(reason, "Data %zu", join.count);
    perror(reason);
finally:
    return success;
}
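
For comparison, the same three-way check can be written without `goto` by returning a status code. This is only a sketch under my own assumptions: `read_doubles` and the `read_status` names are invented for illustration, and the array and its capacity are passed in by the caller rather than taken from `join`.

```c
#include <stdio.h>

/* Hypothetical status codes covering the conditions listed above. */
enum read_status { READ_OK, READ_IO_ERROR, READ_BAD_SYNTAX, READ_FULL };

/* Hypothetical helper: read doubles from fp until end of file.
 * *count receives the number of values stored, whatever the outcome. */
enum read_status read_doubles(FILE *fp, double *data, size_t max, size_t *count)
{
    *count = 0;
    for (;;) {
        double value;
        switch (fscanf(fp, "%lf", &value)) {
        case 1: /* one value converted */
            if (*count == max)
                return READ_FULL; /* no space to hold more */
            data[(*count)++] = value;
            break; /* leaves the switch; the loop tries again */
        case 0: /* matching failure: the next bytes are not a number */
            return READ_BAD_SYNTAX;
        case EOF: /* input failure: end of file or read error */
            return feof(fp) ? READ_OK : READ_IO_ERROR;
        default: /* not expected with a single conversion */
            return READ_IO_ERROR;
        }
    }
}
```

A file holding exactly `max` values ends with `READ_OK`, because the extra `fscanf` call hits end of file before the capacity check is reached; one value more gives `READ_FULL`. Unlike the question's code, this stops at the first bad token rather than storing a sentinel.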

Good point; they might be a state machine, an extra variable, an `enum`, or anything, depending on the style you want; I think the important thing is you check the three possible return values. — Neil, Mar 05 '22 at 03:18
