23
scanf(" %[^\n]",line);

A friend of mine suggested that using fgets() to read a line as input would be a much better idea than using scanf() as in the statement above. Is he justified?

Yu Hao
  • 119,891
  • 44
  • 235
  • 294
amulous
  • 704
  • 2
  • 6
  • 15
  • 3
    it should be scanf(" %[^\n]",line); close the brace in the argument.. – Raghu Srikanth Reddy Jun 25 '13 at 10:27
  • 1
    He absolutely is. I always suggest using `fgets()` and avoiding `scanf()` completely. –  Jun 25 '13 at 10:27
  • @RaghuSrikanthReddy Perfect example of why `scanf()` is not to be used. –  Jun 25 '13 at 10:28
  • 1
    `fgets()` is better. But it is more like `scanf("%[^\n]",line);` (no space in the format). The OP's `" %[^\n]"` will consume leading whitespace without storing in `line`. `fgets()` with store the leading whitespace (that is not a `\n`). – chux - Reinstate Monica Jun 25 '13 at 12:22
  • 2
    Any `scanf` with `%s` or `%[` without a length specifier should be considered a bug. Now if you specified the length, it would have been an interesting question. – Cubbi Jun 30 '13 at 19:23

8 Answers8

25

char * fgets ( char * str, int num, FILE * stream ); is safe to use because it avoid buffer overflow problem, it scans only num-1 number of char.

Reads characters from stream and stores them as a C string into str until (num-1) characters have been read or either a newline or the end-of-file is reached, whichever happens first.

here second argument num is Maximum number of characters to be copied into str (including the terminating null-character).

For example suppose in your code a string array capacity is just 5 chars long as below.

 char str[5];
 fgets (str, 5, fp);  //5 =you have provision to avoid buffer overrun 

Using above code, if input from fp is longer then 4 chars, fgets() will read just first 4 chars then appends \0 (, and discard other extra input chars, just stores five char in str[]).

Whereas scanf(" %[^\n]",str); will read until \n not found and if input string is longer then 4 chars scanf() will cause of buffer overflow (as scanf will try to access memory beyond max index 4 in str[]).

Grijesh Chauhan
  • 57,103
  • 20
  • 141
  • 208
  • 1
    Thanks for the detailed explanation. Will scanf() be a better choice, if I don't know how long my input string will be? – amulous Jun 25 '13 at 10:41
  • @amulous : Mr. YuHao given a very good link. read there, `scanf()`, `printf()` series of functions are good for formatted input/output – Grijesh Chauhan Jun 25 '13 at 10:44
9

C FAQ has some detailed explanation about scanf's problem:

More generally, scanf is designed for relatively structured, formatted input (its name is in fact derived from "scan formatted"). If you pay attention, it will tell you whether it succeeded or failed, but it can tell you only approximately where it failed, and not at all how or why. You have very little opportunity to do any error recovery.

see here for detail.

Yu Hao
  • 119,891
  • 44
  • 235
  • 294
3

Simply put: yes, fgets is a better choice.

I looked at your scanf format specifier and I was mystified. Understanding exactly what it does requires some time reading through the man pages.

Also, your scanf code is susceptible to buffer overruns.

Keep it simple and you will reduce the maintenance costs and avoid hard to find bugs!

Klas Lindbäck
  • 33,105
  • 5
  • 57
  • 82
3

fgets will be better than this scanf. There may be following issues with scanf as given in OP

1)buffer overflow as suggested by @Grijesh

2)possibly next scanf after this won't work because the newline is left in the input stream.(if you miss a blank space)

Dayal rai
  • 6,548
  • 22
  • 29
3

Yes fgets is the better and safe way to read a line from the standard input.

Moreover there will more readability in the code. Look at the scanf statement given by you.

Any second person who sees it will be thoroughly confused. But whereas there will be more readeablity for fgets and it is easy to understand.

Kranthi Kumar
  • 1,184
  • 3
  • 13
  • 26
2

Well, IMHO this is due to the fact that scanf doesn't force you to limit the input size, but fgets does.

If you read the documentation and use scanf properly, there really isn't much difference here:

char line[256];

scanf("%255[^\n]%*c",line); // %*c to remove trailing \n
fgets(line, 256, stdin)

Notice that I removed the leading space from the scanf format string. I'll go back to this later.

Both cases ensure we don't read more than we can.

But, from a safety perspective, we need to think:

  • fgets force you to specify the size
  • On scanf you need to remember to set array_capacity - 1
    • fgets does this for you, so you actually pass in the capacity (or less)
  • "Man what does this format even mean??"

It's easy to forget scanf details, and when you deal with a vast team with different backgrounds in programming, some people might have more trouble to write code with such details and might not even understand the format string. So in general using fgets is safer.


Now, regarding the leading space I removed.

When you use fgets, you lose the ability to ignore whitespace characters before input, so I had to remove that space to make both calls have almost the same result.

I guess we can't really say one way is "better" than the other, only that fgets is more readable and will ensure that you remember to pass the size. Which is something you could also achieve by encapsulating the scanf call into an input reading function that builds a format string correctly. Which would also enable you to skip leading whitespace characters before reading.


EDIT Just so it's clear, my point here is that from a safety perspective (as in "not writing outside the character array"), both solutions are valid since you can limit the number of characters read on both of them.

The code I displayed is purely to show that you can have it limited to a certain size, not that they have exactly the same effect.
As Andrew Henle said, the %*c will indeed throw away one input character if the user provides more characters than the length we said to read.
But then again, that wasn't my point here and neither the question, IMHO. I just put it there to be a bit closer to what fgets does since fgets removes the \n from the buffer if the input isn't bigger than the amount you are trying to read.

The question asks about scanf vs fgets with no particular intent in mind, as far as I understood.
At least one wasn't described on the question.

There are a bunch of extra things you need to consider, depending on what the application need to do, of course.

Some considerations for fgets:

  • If your input has length size - 1, \n will be left in the buffer
    (which will mess up next character inputs...)
  • If your input has length < size, \n will be inserted on the string
    (which you will need to remove, since you probably won't need it)
  • If your input has length > size, remaining characters will be left on the buffer
    (which you will need to treat that somehow too)
  • If you change the array length, you need to update the size on the call
  • Can't ignore whitespace before extracting data

Some considerations for scanf:

  • You need to decide if you can do the %*c safely or not
    (Honestly, it's better to not use %*c, but just have a leading space on the format string when reading characters, as you already are using on the code on your question)
  • You need to remember to set capacity - 1 on the width specifier
  • If you change the array length, you need to update the width specifier
  • Can ignore whitespace before extracting data
  • Most people don't know format specifiers more complex than %s, %d and such, so it might be hard for other people on the team to understand the code
  • If the input length matches the width you specified, \n will be left on the buffer
  • If the input length is greater than width, remaining characters will be left on the input buffer too

Anyway, there's probably many more scenarios to consider, and many of them depend on the exact situation you need to deal with.

Both functions have their pros and cons. I don't think either of them is inherently "bad", both of them require the user to use them correctly and treat some errors themselves, but, fgets certainly has the advantage of forcing you to provide a length and being more readable.

I hope my point is clearer now.

  • 1
    What happens if you feed your `scanf()` a line longer than the buffer? That would discard a character that couldn't be recovered, wouldn't it? – Andrew Henle Oct 30 '19 at 12:22
  • Indeed. That's why I used "isn't much difference" (which means there is a bit of a difference) and "almost the same result". I'll add this information to my comment later so it's more explicit. But I throw back another question for you: what happens if you feed `fgets` with less characters than specified? It will insert the `\n` in the string. So both solutions have their pros and cons. But don't get me wrong, I'm also more favorable to using `fgets` I just feel that some people really don't know all the features scanf has. Some don't even know you can set a length limit, for example. – Leonardo Lourenço Oct 31 '19 at 15:57
  • With `scanf("%255[^\n]%*c",line);` and the _first_ character read is `'\n'`, nothing is read into `line`. `line` remains as is, possible lacking a null character. `'\n'` remains in `stdin`, possibly setting up and endless loop if this function called again. – chux - Reinstate Monica Mar 02 '20 at 23:34
1

Reading a line using scanf() not good?

The primary objection to scanf(some_format, buffer) is a lack buffer overflow protection as with

scanf(" %[^\n]",line);  // Bad - no buffer overflow protection.

An alternative could use

char buffer[100];
scanf(" %99[^\n]",line);  // a little better

Yet that can read multiple leading lines (if only made of white-space), drops leading white-space, lacks an easy way to handle non-constant buffer widths, and it leaves the rest of the line (maybe only '\n') in stdin. Does not return promptly if the first character was '\n'.


fgets() is better.

It still remains with problems though: lines longer than (example) 99 remain in stdin and reading embedded null chracters are hard to detect.

char buffer[100];
if (fgets(buffer, sizeof buffer, stdin)) {
  buffer[strcspn(buffer, "\n")] = '\0';  // Lop off potential tailing \n

Code could use fgetc() and crafted code to handle all cases as desired. Yet reading 1 character at a time incurs significant performance overhead.


Non standard C library getline() is fairly popular.

One downside: It does allow the user to cause code to consume excessive resources: An overlong line can allocate huge amounts of memory.

char *line = NULL;
size_t len = 0;
ssize_t nread;

while ((nread = getline(&line, &len, stdin)) != -1) {

The C standard library lacks a robust get-line function. The closest is fgets().

IMO, do not use scanf() until you know why it is bad.

chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
-1

don't use fgets(...) instead use the following snippet:

char _x[7000];
    char* y;
while ( ! feof (_f) )
{
    fscanf(_f,"%[^\n]\n",_x);
            y=x;
}
mch
  • 15
  • 1