In the book Practical C Programming, I find that the combination of fgets() and sscanf() is used to read input. However, it appears to me that the same objective can be met more easily using just the fscanf() function:

From the book (the idea, not the example):

#include <stdio.h>

int main(void)
{
    int age, weight;
    printf("Enter age and weight: ");

    char line[20];
    fgets(line, sizeof(line), stdin);
    sscanf(line, "%d %d", &age, &weight);

    printf("\nYou entered: %d %d\n", age, weight);
    return 0;
}

How I think it should be:

#include <stdio.h>

int main(void)
{
    int age, weight;
    printf("Enter age and weight: ");
    fscanf(stdin, "%d %d", &age, &weight);

    printf("\nYou entered: %d %d\n", age, weight);
    return 0;
}

Or is there some hidden quirk I'm missing?

ankush981

3 Answers

There are a few behavioral differences between the two approaches. If you use fgets() + sscanf(), you must enter both values on the same line, whereas fscanf() on stdin (or equivalently, scanf()) will read them from different lines if it doesn't find the second value on the first line you entered.
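
A minimal sketch of that difference, using two hypothetical helpers (read_pair_by_line() and read_pair_from_stream() are names invented here, and the 64-byte buffer is an assumption):

```c
#include <stdio.h>

/* Hypothetical helper: read ONE line, then parse two ints from it.
   Returns the number of fields converted, or EOF if no line was read. */
int read_pair_by_line(FILE *in, int *age, int *weight)
{
    char line[64];  /* assumed to be large enough for one input line */

    if (fgets(line, sizeof line, in) == NULL)
        return EOF;
    return sscanf(line, "%d %d", age, weight);
}

/* Hypothetical helper: parse two ints directly from the stream.
   fscanf() skips whitespace, including newlines, between the fields. */
int read_pair_from_stream(FILE *in, int *age, int *weight)
{
    return fscanf(in, "%d %d", age, weight);
}
```

Given the input "25", newline, "70", newline, the line-based helper converts only the first value, while the stream-based one goes on to the second line and converts both.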

But, probably the most important differences have to do with handling error cases and the mixing of line oriented input and field oriented input.

If sscanf() cannot parse a line that you read with fgets(), your program can simply discard the line and move on. However, when fscanf() fails to convert a field, it leaves the unconverted input on the stream. So, if you failed to read the input you wanted, you'd have to read and discard the unwanted data yourself.
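
As a rough sketch of the "discard the line and move on" approach (read_age_weight() is a hypothetical name, and the 128-byte buffer is an assumption about the longest expected line):

```c
#include <stdio.h>

/* Hypothetical helper: keep reading lines until one parses as two ints.
   Returns 1 on success, 0 if the input ends before a good line is seen. */
int read_age_weight(FILE *in, int *age, int *weight)
{
    char line[128];  /* assumed to be large enough for one input line */

    while (fgets(line, sizeof line, in) != NULL) {
        if (sscanf(line, "%d %d", age, weight) == 2)
            return 1;
        /* Bad line: fgets() already consumed it, so just try the next one. */
    }
    return 0;
}
```

Calling fscanf(in, "%d %d", ...) in the same kind of loop would not make progress on a line such as "abc", because the offending characters stay on the stream until something else reads them.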

The other subtle gotcha arises if you mix field-oriented calls (à la scanf()) with line-oriented calls (e.g. fgets()) in your code. When scanf() converts an int, for example, it leaves the trailing \n on the input stream (assuming there was one, as from pressing the Enter key), so a subsequent call to fgets() returns immediately with only that character in the buffer. This is a really common issue for new programmers.
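
If the two styles do have to be mixed, one common workaround is to throw away the remainder of the line after each field-oriented read. A minimal sketch, where skip_rest_of_line() is a hypothetical helper name:

```c
#include <stdio.h>

/* Hypothetical helper: discard everything up to and including the next
   newline, or up to end of input if no newline remains. */
void skip_rest_of_line(FILE *in)
{
    int c;

    while ((c = getc(in)) != EOF && c != '\n')
        ;  /* nothing to do: the characters are simply dropped */
}
```

Calling skip_rest_of_line(stdin) right after scanf("%d", &n) removes the leftover newline, so the next fgets() starts at the following line instead of returning "\n".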

So, while you are right that you can just use fscanf() like that, you may be able to avoid some headaches by using fgets() + sscanf().

FatalError
  • This is true. On the other hand, you would add some headache because `fgets` would read a line with a maximum length which can cause trouble. Imagine your buffer is 10 chars (exaggerated just for example) and then given input: `" 12345\n"` which you would receive as `" 12"` and `"345\n"`. The most secure way is to use a loop over `fgets` and `realloc` to read a complete line no matter its size. – Shahbaz Mar 11 '14 at 16:58
  • @Shahbaz I'm still new to C, but I tried this and found that "12345" doesn't get split up the way you explained ("12" and "345\n"). Instead, I got "12345" in the first variable and some negative garbage value in the second. What point were you trying to make? Why should this happen? – ankush981 Mar 11 '14 at 17:07
  • Shahbaz: for simple problems, choose a large enough buffer, read the first characters and ignore the others in a loop until you read a `'\n'` :) – pmg Mar 11 '14 at 17:10
  • @dotslash, with `fgets` you have to give a bounded buffer. If there is a number at the boundary of that size, it can get split. In your own example, try giving an input with 17 spaces followed by a 6 digit number. – Shahbaz Mar 11 '14 at 17:21
  • @pmg, for simple problems, check for output of `scanf("%d", ...)` and if it's not 1, do `scanf("%*s")` and retry. For simple problems, it's easy. It's real problems that show how `fscanf` can be troublesome and as I point out `fgets` can get troublesome too. – Shahbaz Mar 11 '14 at 17:22
  • @Shahbaz Do not agree about "read a complete line no matter its size.". Certainly a buffer should be of ample size - I prefer 2x typical max size. But these days there are too many buffer attacks and thus allowing unlimited size is a way to crash a program. Better to limit input length to a generous amount rather than unbounded. – chux - Reinstate Monica Mar 11 '14 at 17:22
  • @chux, sure you can generate a failure if a line started to get ridiculously large. What I'm saying is that if you want to keep a maximum size, which is fine, you'd then have to be sure the line you read has fit in the buffer, otherwise the data may not be reliable and you may need to ignore what part of the line you have already read as well as what part of it is still left. By the way, crashing a program by exhausting its memory is not really an attack. You can just [limit its memory](http://stackoverflow.com/q/4651234/912144), or send a KILL signal or whatever! It's not a security issue. – Shahbaz Mar 11 '14 at 18:02
  • @Shahbaz You do raise some really interesting points! Thanks for this. :) – ankush981 Mar 12 '14 at 16:29
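
The truncation concern raised in these comments can be checked for explicitly: if fgets() returns a buffer with no '\n' in it and more input follows, the line did not fit. A minimal sketch, assuming a fixed-size buffer of at least 2 bytes and a hypothetical helper name read_whole_line():

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line into buf (size must be at least 2).
   Returns 1 if the whole line fit, 0 if it was too long (the unread
   remainder of that line is discarded), and EOF if nothing was read. */
int read_whole_line(FILE *in, char *buf, size_t size)
{
    int c;

    if (fgets(buf, (int)size, in) == NULL)
        return EOF;

    if (strchr(buf, '\n') != NULL)
        return 1;  /* newline present: the line is complete */

    /* No newline in buf: look at what comes next on the stream. */
    c = getc(in);
    if (c == EOF || c == '\n')
        return 1;  /* only the newline (or nothing) was left over */

    /* The line was cut off: drop the rest so the next read starts clean. */
    while ((c = getc(in)) != EOF && c != '\n')
        ;
    return 0;
}
```

With a 20-byte buffer and the input suggested above (17 spaces followed by a 6-digit number), this returns 0 instead of silently handing half of the number to sscanf().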

The problem with only using fscanf() is, mostly, in error management.

Imagine you input "51 years, 85 Kg" to both programs.

The first program fails in sscanf(), and you still have the line available to report errors to the user, to try a different parsing alternative, or to do something else.

The second program fails at "years": age is usable, weight is unusable.

Remember to always check the return value of *scanf() for error checking.

    fgets(line, sizeof(line), stdin);
    if (sscanf(line, "%d%d", &age, &weight) != 2) /* error with input */;

Edit

With your first program, after the error, the input buffer is clear; with the second program the input buffer starts with "years..."

Recovery in the first case is easy; recovery in the second case has to go through some sort of clearing the input buffer.
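
A small sketch of what "you still have the line" makes possible, assuming the line was already read with fgets() (parse_age_weight() is a hypothetical helper name, and the error message wording is invented for illustration):

```c
#include <stdio.h>

/* Hypothetical helper: parse two ints from a line that was already read.
   Returns 1 on success. On failure it reports the offending line on
   stderr and returns 0. */
int parse_age_weight(const char *line, int *age, int *weight)
{
    int n = sscanf(line, "%d%d", age, weight);

    if (n != 2) {
        /* The whole line is still in memory, so it can be shown to the user. */
        fprintf(stderr, "Could not parse (%d of 2 fields): %s",
                n < 0 ? 0 : n, line);
        return 0;
    }
    return 1;
}
```

For "51 years, 85 Kg" this reports 1 of 2 fields converted and returns 0, and nothing is left behind on stdin to clean up.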

pmg

There is no difference between fscanf() and fgets()/sscanf() when:

  1. Input data is well-formed.

Two types of errors can occur: I/O and format. fscanf() handles both error types in one function but offers few recovery options. Separate fgets() and sscanf() calls allow logical separation of I/O issues from format ones, and thus better recovery.

  2. Only one parsing path is needed.

Separating I/O from scanning, as with fgets()/sscanf(), allows multiple sscanf() attempts. Should a given scan of the buffer not produce the desired results, other sscanf() calls with different formats can be tried.

  3. There is no embedded '\0'.

A '\0' rarely occurs in input, but should one occur, sscanf() stops scanning at it and never sees what follows, whereas fscanf() continues.

In all cases, check results of all three functions.
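
To illustrate the multiple-parsing-path point, here is a minimal sketch that tries two formats against the same buffer. The helper name parse_flexible() and the second format are assumptions chosen for illustration:

```c
#include <stdio.h>

/* Hypothetical helper: try several formats against one line.
   Returns 1 if any format yields both values, 0 otherwise. */
int parse_flexible(const char *line, int *age, int *weight)
{
    /* First attempt: plain numbers, e.g. "51 85". */
    if (sscanf(line, "%d %d", age, weight) == 2)
        return 1;

    /* Second attempt: annotated input, e.g. "51 years, 85 Kg". */
    if (sscanf(line, "%d years, %d Kg", age, weight) == 2)
        return 1;

    return 0;  /* no known format matched */
}
```

This works only because the line sits in a buffer that can be scanned again. With fscanf(), the first failed attempt would already have consumed part of the input.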

chux - Reinstate Monica