11

I have been told, by most experts and by users on StackOverflow, that scanf should not be used to read a string from the user and that gets() should be used instead. I never asked on StackOverflow why scanf should be avoided in favor of gets for strings. That is not my actual question, but an answer to it would be greatly appreciated.

Now coming to the actual question. I came across this type of code -

scanf("%[^\n]s",a); 

This reads a string until the user enters a newline character, treating whitespace as part of the string.

Is there any problem if I use

scanf("%[^\n]s",a);

instead of gets?

Is gets more optimized than scanf? It sounds as though it would be, since gets is dedicated purely to handling strings. Please let me know about this.

Update

This link helped me to understand it better.

niko
  • 2
    `gets` isn't a very good idea either because of the risk of buffer overflows. Use `fgets`. – Etienne de Martel Nov 18 '11 at 04:28
  • @etiennedeMartel Thanks, but to my knowledge fgets is for file handling, right? I just learnt that f indicates file mode, so why the hell do they have both gets and fgets()? – niko Nov 18 '11 at 04:29
  • 4
    You can read from standard input with `fgets` by passing `stdin` as the last parameter. The advantage of using `fgets` over `gets` is that with `fgets`, you can specify the length of your buffer, preventing `fgets` from reading _too_ much data. `gets` is a security risk, and they never fixed it for backward compatibility reasons. – Etienne de Martel Nov 18 '11 at 04:31
  • @niko - Any `f*` function (say, `fscanf`) can be used on the `stdin` filehandle to emulate the non-`f*` version (in this case, `fscanf(stdin, ...)` is exactly equivalent to `scanf(...)`). – Chris Lutz Nov 18 '11 at 04:32
  • @EtiennedeMartel You mean that if the given size is 20, then once the user's input reaches 20 characters it returns automatically, without even waiting for a newline character from the user? Well, let me try it then. – niko Nov 18 '11 at 04:33
  • @ChrisLutz Thanks to you both – niko Nov 18 '11 at 04:34
  • 3
    @niko - No, reading from `stdin` will always wait for a newline. However, if the user enters more characters than `fgets` asked for, they are stored in a buffer that is part of the `FILE *` structure and is not meant to be accessed by you. A second call to `fgets` will, instead of reading more user input, return more data from that buffer, until it is empty. (`fgetc` and the other one-character-at-a-time functions do the exact same thing.) – Chris Lutz Nov 18 '11 at 04:36
  • Never use `gets()`. It cannot be used safely. It's even being removed from the next version of the C standard. – Keith Thompson Nov 18 '11 at 06:33
  • You should instead use `gets_s` *(new in C11 standard)* which allows you to safely read from standard input to a buffer without the hassle of stripping ending newline characters. – Anders Marzi Tornblad Jun 17 '16 at 09:31
  • @EtiennedeMartel The tutorial site says that gets() is removed from the C standard (https://www.programiz.com/c-programming/c-strings). Is that true? – Cyriac Antony Oct 23 '19 at 05:05
  • @CyriacAntony I don't know what this site is, and I have never encountered it before. That being said, `gets` was removed in C11, yes. Some implementations might still have it, however. The standard alternative since C11 is `gets_s`. – Etienne de Martel Oct 23 '19 at 17:48
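
The buffering behavior described in the comments above can be observed with a deliberately small buffer. The following is only a sketch: `read_chunk` is a hypothetical helper name, not part of any library, and its return convention is an assumption made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read at most size-1 characters of the current line
 * from fp into buf.
 * Returns  1 if this chunk ends the line (newline stripped, or EOF reached),
 *          0 if the rest of the line may still be waiting in the stream,
 *         -1 if nothing could be read (EOF or error). */
int read_chunk(char *buf, size_t size, FILE *fp)
{
    assert(buf != NULL && size > 1 && fp != NULL);

    if (fgets(buf, (int)size, fp) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';   /* fgets keeps the newline; drop it */
        return 1;
    }
    /* No newline stored: either the buffer filled up or the input ended. */
    return feof(fp) ? 1 : 0;
}
```

Calling `read_chunk(buf, sizeof buf, stdin)` repeatedly with a 5-byte buffer on the input line `abcdefghij` would be expected to hand back `abcd`, `efgh`, and `ij` in turn, without asking the user for more input in between.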

2 Answers

8

gets(3) is dangerous and should be avoided at all costs. I cannot envision a use where gets(3) is not a security flaw.

scanf(3)'s %s is also dangerous -- you must use the "field width" specifier to indicate the size of the buffer you have allocated. Without the field width, this routine is as dangerous as gets(3):

char name[64];
scanf("%63s", name);
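
To see the field width at work without typing input, the same conversion can be run through `sscanf`. This is a minimal sketch; `read_word` is a hypothetical wrapper introduced only for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: parse one whitespace-delimited word from `input`
 * into a 64-byte buffer. The width 63 leaves one byte for the
 * terminating '\0', so the buffer cannot overflow. */
int read_word(const char *input, char name[64])
{
    assert(input != NULL && name != NULL);
    return sscanf(input, "%63s", name);
}
```

Given a 200-character word, `read_word` should store only the first 63 characters plus the terminator; the rest stays unread in the input.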

The GNU C library provides an "a" modifier to %s that allocates the buffer for you. This non-portable extension is probably less difficult to use correctly:

   The GNU C library supports a nonstandard extension that
   causes the library to dynamically allocate a string of
   sufficient size for input strings for the %s and %a[range]
   conversion specifiers.  To make use of this feature, specify
   a as a length modifier (thus %as or %a[range]).  The caller
   must free(3) the returned string, as in the following
   example:

       char *p;
       int n;

       errno = 0;
       n = scanf("%a[a-z]", &p);
       if (n == 1) {
           printf("read: %s\n", p);
           free(p);
       } else if (errno != 0) {
           perror("scanf");
       } else {
           fprintf(stderr, "No matching characters\n");
       }

   As shown in the above example, it is only necessary to call
   free(3) if the scanf() call successfully read a string.
sarnold
  • 2
    You can use a length modifier with either `%[` or `%s`. `%64[a-z]` reads up to 64 lowercase alphabetic characters, and `%64s` reads up to 64 non-whitespace characters. – Adam Rosenfield Nov 18 '11 at 06:01
  • @Adam: That'll teach me to write on an empty (and thus distracted :) stomach -- I read both `scanf(3)` and `scanf(3posix)` looking for _some_ sign that I was wrong.... but there it is: _The input string stops at white space **or at the maximum field width**, whichever occurs first._ Many thanks! – sarnold Nov 18 '11 at 07:26
  • 2
    `char name[64]; scanf("%64s", name);` shouldn't it be `scanf("%63s", name);`? which leaves a location for `NUL` character. – Nan Xiao Nov 06 '16 at 08:20
  • @NanXiao, wow. I can't believe I made that mistake. Excellent catch! – sarnold Nov 10 '16 at 20:27
  • 2
    But how is this answer even relevant to the question? The question is about implementing line-based input (and not skipping whitespace). Neither `%s` nor `a` modifier are even close to the topic. – AnT stands with Russia Nov 10 '16 at 21:12
5

Firstly, it is not clear what that s is doing in your format string. The %[^\n] part is a self-sufficient format specifier. It is not a modifier for the %s format, as you seem to believe. This means that the "%[^\n]s" format string will be interpreted by scanf as two independent directives: %[^\n] followed by a lone s. This will direct scanf to read everything until \n is encountered (leaving \n unread), and then require that the next input character is s. This just doesn't make any sense. No input can fully match such a self-contradictory format.
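
One way to observe that the trailing s is a separate directive is to append %n, which is only reached if everything before it matched. A rough sketch, using `sscanf` on a fixed string; `scan_with_stray_s` is a hypothetical name chosen for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: apply "%[^\n]s" (plus a trailing %n probe) to `input`.
 * *consumed is written by %n only if the literal 's' directive matched;
 * otherwise it keeps the sentinel value -1. */
int scan_with_stray_s(const char *input, char buf[64], int *consumed)
{
    assert(input != NULL && buf != NULL && consumed != NULL);
    *consumed = -1;
    return sscanf(input, "%63[^\n]s%n", buf, consumed);
}
```

For an input such as `"glass house\nnext"`, the %[^\n] conversion still stores `glass house` and the call returns 1, but the literal s then meets the newline and fails, so `*consumed` should stay at -1.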

Secondly, what was apparently meant is scanf("%[^\n]", a). This is somewhat close to the [no longer available] gets (or to fgets), but it is not the same. scanf requires that each format specifier match at least one input character. scanf will fail and abort if it cannot match any input characters for the requested format specifier. This means that scanf("%[^\n]",a) is not capable of reading empty input lines, i.e. lines that begin with the \n character immediately. If you feed such a line into the above scanf, it will return 0 to indicate failure and leave a unchanged. That's very different from how typical line-based input functions work.

(This is a rather surprising and seemingly illogical property of the %[] format. Personally, I'd prefer %[] to be able to match empty sequences and produce empty strings, but that's not how standard scanf works.)
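
The empty-line failure is easy to reproduce with `sscanf`. The following is a small sketch; `scan_line` is a hypothetical wrapper, and the field width 63 is an addition here to keep the example safe for a 64-byte buffer.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: try to read one line's worth of characters
 * (everything up to, but not including, '\n') from `input`. */
int scan_line(const char *input, char buf[64])
{
    assert(input != NULL && buf != NULL);
    return sscanf(input, "%63[^\n]", buf);
}
```

With `"first\nsecond"` the call returns 1 and stores `first`. With `"\nsecond"`, where the line is empty, it returns 0 and leaves the buffer exactly as it was.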

If you want to read the input in line-by-line fashion, fgets is your best option.
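
A minimal sketch of line-based input with fgets, assuming you want the trailing newline removed; `read_line` is a hypothetical helper name, not a standard function.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from fp into buf and drop the
 * trailing newline, if any. Returns buf on success, NULL at EOF or on
 * error. Unlike "%[^\n]", an empty line is read as an empty string.
 * Lines longer than size-1 characters are returned in pieces. */
char *read_line(char *buf, size_t size, FILE *fp)
{
    assert(buf != NULL && size > 1 && fp != NULL);

    if (fgets(buf, (int)size, fp) == NULL)
        return NULL;

    buf[strcspn(buf, "\n")] = '\0';   /* cut at the first newline */
    return buf;
}
```

A typical loop would be `while (read_line(line, sizeof line, stdin) != NULL) { ... }`, where `line` is a char array of whatever size suits your input.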

AnT stands with Russia