6

After Mark Lakata pointed out that "garbage" isn't properly defined in my question, I came up with this definition. I'll keep it updated to avoid confusion.


I am trying to get a function that I can call before a prompt for user input, like printf("Enter your choice:"); followed by a scanf, and be sure that only what is entered after the prompt is scanned in by scanf as valid input.

As far as I can understand the function that is needed is something that flushes standard input completely. That is what I want. So for the purpose of this function the "garbage" is everything in user input i.e. the whole user input before that user prompt.


While using scanf() in C there is always the problem of extra input lying in the input buffer. So I was looking for a function that I call after every scanf call to remedy this problem. I used this, this, this and this to get these answers

//First approach
scanf("%*[^\n]\n");

//2nd approach
scanf("%*[^\n]%*c");

//3rd approach
int c;
while((c = getchar()) != EOF) 
    if (c == '\n') 
        break;

All three work as far as I could tell from trial and error and from the references. But before using any of them in all of my code, I wanted to know whether any of them has bugs.

EDIT:

Thanks to Mark Lakata for one bug in 3rd. I corrected it in the question.

EDIT2:

After Jerry Coffin answered, I tested the first 2 approaches using this program in the Code::Blocks IDE 12.11 with the GNU GCC compiler (version not stated in the compiler settings).

#include<stdio.h>

int main()
{
    int x = 3; //Some arbitrary value
    //1st one
    scanf("%*[^\n]\n");
    scanf("%d", &x);
    printf("%d\n", x);

    x = 3;
    //2nd one
    scanf("%*[^\n]%*c");
    scanf("%d", &x);
    printf("%d", x);
}

I used the following 2 inputs

First Test Input (2 Newlines but no spaces in the middle of garbage input)

abhabdjasxd


23
bbhvdahdbkajdnalkalkd



46

For the first I got the following output by the printf statements

23
46

i.e. both codes worked properly.

Second Test input: (2 Newlines with spaces in the middle of garbage input)

hahasjbas asasadlk


23
manbdjas sadjadja a


46

For the second I got the following output by the printf statements

23
3

Hence I found that the second one won't be taking care of extra garbage input whitespaces. Hence, it isn't foolproof against garbage input.

I decided to try out a 3rd test case (garbage includes newline before and after the non-whitespace character)

hahasjbas asasadlk


23

manbdjas sadjadja a


46

The answer was

3
3

i.e. both failed in this test case.

Aseem Bansal
  • Why is it tagged with optimization ? – 0x90 Jun 22 '13 at 07:38
  • @0x90 Because I wanted to know whether they are optimal and if not possible optimization. – Aseem Bansal Jun 22 '13 at 08:06
  • ok so read this http://stackoverflow.com/questions/5638999/effective-stdin-reading-c-programming – 0x90 Jun 22 '13 at 08:09
  • @0x90 The accepted answer is good when I am reading large inputs from files but when I am reading very small inputs like 1 or 2 integers wouldn't the approach in the linked question be less optimal due to calls to malloc and then parsing the strings into integers? – Aseem Bansal Jun 22 '13 at 08:22
  • You can avoid malloc by declaring `char buf[256];` – 0x90 Jun 22 '13 at 08:26
  • fgets() or sscanf() ???? – 0decimal0 Jun 22 '13 at 09:27
  • Constructive criticism : If you want to downvote at least leave a comment to explain it – Aseem Bansal Jun 22 '13 at 16:55
  • 1
    Aseem, I think you are going about this problem incorrectly. What is "garbage" input? What are you trying to read? Are you just trying to skip over everything that is not numeric? – Mark Lakata Jul 03 '13 at 16:40
  • @MarkLakata I am trying to get a function that I can call before a prompt for user input, like `printf("Enter your choice:");` followed by a `scanf`, and be sure that only what is entered after the prompt is scanned in by `scanf` as valid input. As far as I can understand, the function that is needed is something that flushes standard input completely. That is what I want. So for the purpose of this function the `"garbage"` is everything in user input, i.e. the whole user input before that user prompt. – Aseem Bansal Jul 03 '13 at 16:58
  • @MarkLakata Am I wrong in the analysis of the problem or does this add clarity to the situation? – Aseem Bansal Jul 03 '13 at 16:59
  • You can't use standard C to do what you want to do. Standard C is not meant for interactive user input, where the user may type junk information before or after the program prompts for it. There are ways around it, but I don't think it is worth the effort. Most people do not use interactive console input for applications today anyways, and if they do, they are expected not to type garbage. – Mark Lakata Jul 03 '13 at 17:53
  • If you want to "flush the input stream", you need to write code that accesses the console directly, not just stdin (which is always buffered). There are packages such as `curses` that do that ... but as I said, that is probably not what you want to do. – Mark Lakata Jul 03 '13 at 17:56

5 Answers

9

The first two are subtly different: they both read and ignore all the characters up to a new-line. Then the first skips all consecutive white space so after it executes, the next character you read will be non-whitespace.

The second reads and ignores characters until it encounters a new-line then reads (and discards) exactly one more character.

The difference will show up if you have (for example) double-spaced text, like:

line 1

line 2

Let's assume you read to somewhere in the middle of line 1. If you then execute the first one, the next character you read in will be the 'l' on line 2. If you execute the second, the next character you read in will be the new-line between line 1 and line 2.
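The difference can also be observed without any keyboard input. The sketch below is only an illustration: the helper names `consumed_by_first` and `consumed_by_second` are made up for this example, and it uses `sscanf` with `%n` on a string instead of `scanf` on stdin so that the number of characters each format consumes becomes visible. It also shows the caveat raised in the comments under this answer: when the input starts with a new-line, `%*[^\n]` matches nothing and the scan stops before anything is consumed.

```c
#include <stdio.h>

/* Hypothetical helper: returns how many characters the first format
   consumes from s, or -1 if the scan stopped before reaching %n. */
int consumed_by_first(const char *s)
{
    int n = -1;
    sscanf(s, "%*[^\n]\n%n", &n);
    return n;
}

/* Hypothetical helper: same measurement for the second format. */
int consumed_by_second(const char *s)
{
    int n = -1;
    sscanf(s, "%*[^\n]%*c%n", &n);
    return n;
}
```

With the double-spaced text `"line 1\n\nline 2"`, the first helper reports 8 characters (the line plus both new-lines) and the second reports 7 (the line plus one new-line). With `"\nabc"`, both report -1 because the scan set fails to match and `scanf`-family functions stop at the first failed directive.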

As for the third, if I were going to do this at all, I'd do something like:

int ch;
while ((ch=getchar()) != EOF && ch != '\n')
    ;

...and yes, this does work correctly -- && forces a sequence point, so its left operand is evaluated first. Then there's a sequence point. Then, if and only if the left operand evaluated to true, it evaluates its right operand.
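Wrapped in a function, the loop could look like the sketch below. The name `discard_line` and the `FILE *` parameter are assumptions made for this example (the loop above reads stdin via `getchar`); passing the stream makes the helper usable on files as well, and the return value tells the caller whether a new-line or the end of input was reached.

```c
#include <stdio.h>

/* Hypothetical helper: discards the rest of the current line on `in`,
   including the new-line. Returns '\n' if a new-line was consumed,
   or EOF if the input ended (or failed) first. */
int discard_line(FILE *in)
{
    int ch;
    while ((ch = getc(in)) != EOF && ch != '\n')
        ;
    return ch;
}
```

A typical use would be `discard_line(stdin);` right after a `scanf` call, so that whatever the user typed after the converted value on that line does not reach the next read. Note that it discards only one line; it does not know about input typed earlier or later.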

As for performance differences: since you're dealing with I/O to start with, there's little reasonable question that all of these will always be I/O bound. Despite its apparent complexity, scanf (and company) are usually code that's been used and carefully optimized over years of use. In this case, the hand-rolled loop may be quite a bit slower (e.g., if the code for getchar doesn't get expanded inline) or it may be about the same speed. The only way it stands any chance of being significantly faster is if the person who wrote your standard library was incompetent.

As for maintainability: IMO, anybody who claims to know C should know the scan set conversion for scanf. This is neither new nor rocket science. Anybody who doesn't know it really isn't a competent C programmer.

Jerry Coffin
  • Your explanation helped. I couldn't understand either of them before this. I tried both of them. But for some reason your explanation didn't hold and both worked in test case 1 added into the question. Is that due to `scanf`'s internal working? Any advice regarding the 3rd test case? – Aseem Bansal Jul 03 '13 at 16:08
  • 1
    @AseemBansal: your tests are kind of broken because `%d` will skip leading white space, so with it, it makes no difference whether the leading whitespace was skipped already. Try using `%c` afterwards instead. – Jerry Coffin Jul 04 '13 at 01:54
  • @JerryCoffin you may want to reconsider your final paragraph; your first two paragraphs overlook the fact that if the garbage consisted of just a newline then `scanf("%*[^\n]%*c");` and `scanf("%*[^\n]\n");` stop after the first specifier because it failed to match anything. So neither solves the problem. (A fix would be to break each into two scanf calls). – M.M Sep 16 '14 at 21:11
  • @MattMcNabb: It looks to me like you're simply asking for something the question didn't ask (at least at the time--it's undergone substantial editing since). Ultimately, it's a question of what he really wants. It's certainly true that scanf will stop scanning as soon as any conversion fails. Whether he wants that to happen or wants to break it into two calls to read a character even if there's nothing preceding the new-line seems open to question. – Jerry Coffin Sep 16 '14 at 21:28
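The two-call variant mentioned in the comments above can be sketched as follows. This is one possible reading of that suggestion, not code from any of the commenters; the name `skip_line_scanf` and the `FILE *` parameter are assumptions for the example. Splitting the format means the new-line is still consumed when the scan set matches nothing.

```c
#include <stdio.h>

/* Hypothetical helper: discards the rest of the current line using two
   separate fscanf calls. Returns 0 if a character was discarded by the
   second call, or EOF if the input ended first. */
int skip_line_scanf(FILE *in)
{
    /* Discards everything up to, but not including, the new-line.
       On an empty line this matches nothing and returns 0. */
    if (fscanf(in, "%*[^\n]") == EOF)
        return EOF;

    /* Discards exactly one character, which is the new-line if present. */
    if (fscanf(in, "%*c") == EOF)
        return EOF;

    return 0;
}
```

Because each call is a separate scan, a failed match in the first one no longer prevents the second from running, which is the case where the single-format versions leave the new-line in the buffer.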
3

The first 2 examples use a feature of scanf that I didn't even know existed, and I'm sure a lot of other people didn't know. Being able to support a feature in the future is important. Even if it was a well known feature, it will be less efficient and harder to read the format string than your 3rd example.

The third example looks fine.

(edit history: I made a mistake saying that ANSI-C did not guarantee left-to-right evaluation of && and proposed a change. However, ANSI-C does guarantee left-to-right evaluation of &&. I'm not sure about K&R C, but I can't find any reference to it and no one uses it anyways...)

Mark Lakata
  • +1 for the bug you pointed out in the 3rd. Didn't see that. About 1st and 2nd, they don't use any additional variable while the 3rd does have repeated function calls. So shouldn't one of those be more efficient? – Aseem Bansal Jul 03 '13 at 05:59
  • Look at the internals of scanf and you will see that it is much less efficient. getchar() (depending on implementation) on the other hand is very simple and in some implementations could even be inlined in your code, making it super fast. – Mark Lakata Jul 03 '13 at 06:17
  • What do you mean by inlined? I don't know the term. Do [this question](http://stackoverflow.com/questions/132738/why-should-i-ever-use-inline-code) and the [Inline expansion wikipedia page](http://en.wikipedia.org/wiki/Inline_expansion) refer to the same thing that you are referring to? – Aseem Bansal Jul 03 '13 at 07:00
  • Is [this](http://mail.python.org/pipermail/tutor/2004-August/031186.html) referring to the same inlining that you are referring to? It has a simpler explanation. – Aseem Bansal Jul 03 '13 at 07:02
  • 1
    inline means it is written as a function call, but it is more like a #define macro. There is no overhead of calling the function, and the optimizer can also optimize register usage. C++ has the `inline` keyword to "force" inlining, but many C compilers will do it with heavy optimization automatically. The number one rule of programming though is to get a program that *works* first, then worry about optimization later. – Mark Lakata Jul 03 '13 at 16:26
  • The original code was `((c = getchar()) != '\n' && c != EOF)` . This *does* guarantee that the assignment to `c` is completed before the second operand is evaluated; the `&&` operator does *short circuit evaluation* and introduces a sequence point. This didn't change between ANSI (C89) and C90 . – M.M Sep 16 '14 at 21:18
  • I don't have my copy of K&R1 handy but my recollection is that `&&` always short-circuited since they introduced it . In any case, IMHO mangling code to support pre-C89 compilers is not practical. – M.M Sep 16 '14 at 21:19
1

Many other solutions have the problem that they cause the program to hang and wait for input when there is nothing left to flush. Waiting for EOF is wrong because you don't get that until the user closes the input completely!

On Linux, the following will do a non-blocking flush:

// flush any data from the internal buffers
fflush (stdin);

// read any data from the kernel buffers; stop on error (-1) or end of input (0)
char buffer[100];
while (recv (0, buffer, sizeof buffer, MSG_DONTWAIT) > 0)
  {
  }

The Linux man page says that fflush on stdin is non-standard, but "Most other implementations behave the same as Linux."

The MSG_DONTWAIT flag is also non-standard (it causes recv to return immediately if there is no data to be delivered).
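One caveat is that `recv` only works on sockets; when standard input is a terminal, pipe, or file it fails with `ENOTSOCK`. A possible alternative, sketched below under the assumption of a POSIX system, is to ask `poll` with a zero timeout whether data is pending and `read` it only then. The name `drain_fd` is made up for this example. It discards only what the kernel has already made available on the descriptor, and it does not touch anything already sitting in stdio's own buffer.

```c
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: reads and discards whatever is immediately
   available on fd without blocking. Returns the number of bytes
   discarded, or -1 on error. */
long drain_fd(int fd)
{
    char buf[256];
    long total = 0;
    struct pollfd p = { .fd = fd, .events = POLLIN };

    for (;;) {
        int ready = poll(&p, 1, 0);          /* timeout 0: never wait */
        if (ready < 0)
            return -1;
        if (ready == 0 || !(p.revents & POLLIN))
            break;                           /* nothing pending */

        ssize_t n = read(fd, buf, sizeof buf);
        if (n < 0)
            return -1;
        if (n == 0)
            break;                           /* end of input */
        total += n;
    }
    return total;
}
```

For standard input the call would be `drain_fd(STDIN_FILENO)`. On a terminal in its default line-buffered mode, a line the user has started typing but not yet finished with Enter is typically not visible to `poll`, so it would not be discarded.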

ams
-1

You should use getline/getchar:

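#define _POSIX_C_SOURCE 200809L  /* exposes the POSIX getline() declaration */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>  /* ssize_t */

int main(void)
{
  ssize_t bytes_read;
  size_t nbytes = 100;  /* getline() expects a size_t buffer size */
  char *my_string;

  puts ("Please enter a line of text.");

  /* These 2 lines are the heart of the program. */
  my_string = malloc (nbytes + 1);
  bytes_read = getline (&my_string, &nbytes, stdin);

  if (bytes_read == -1)
    {
      puts ("ERROR!");
    }
  else
    {
      puts ("You typed:");
      puts (my_string);
    }

  free (my_string);  /* getline() may have reallocated the buffer */
  return 0;
}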
0x90
  • getline() is in Standard C++ not standard C. getchar() is in standard C. I have one of the methods using getchar() but don't understand how getchar() can be used to read ints, floats, etc. – Aseem Bansal Jun 22 '13 at 08:11
  • 1
    You should read into a local buffer and then use `sscanf`, `strtol`, or `strtof`. – 0x90 Jun 22 '13 at 08:19
  • The code is using, more or less, the POSIX [`getline()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/getline.html) function. The 'or less' is because the buffer size variable (`nbytes`) should be of type `size_t` rather than `int`. In general, you should free the buffer too. Here, it doesn't matter much, but if this was a function rather than the `main()` program, it would matter. – Jonathan Leffler Jan 20 '16 at 14:55
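The buffer-then-parse approach suggested in these comments can be written in standard C without `getline` or `malloc`. The sketch below is one way to do it, with assumptions labeled: the name `read_int_line`, the 256-byte buffer, and the choice to reject lines with trailing non-whitespace are all illustrative, not something the comments specify.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reads one whole line from `in` and parses it as a
   decimal int. Returns 1 on success, 0 if the line is not a valid int,
   and EOF at end of input. */
int read_int_line(FILE *in, int *out)
{
    char buf[256];
    char *end;
    long v;

    if (fgets(buf, sizeof buf, in) == NULL)
        return EOF;

    /* If the line did not fit in the buffer, discard the remainder so the
       next call starts on a fresh line. */
    if (strchr(buf, '\n') == NULL) {
        int ch;
        while ((ch = getc(in)) != EOF && ch != '\n')
            ;
    }

    errno = 0;
    v = strtol(buf, &end, 10);
    if (end == buf || errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;                       /* no digits, or out of range */

    /* Accept trailing whitespace only. */
    while (*end == ' ' || *end == '\t' || *end == '\n' || *end == '\r')
        end++;
    if (*end != '\0')
        return 0;

    *out = (int)v;
    return 1;
}
```

Since every call consumes exactly one line, a bad line never lingers in the buffer: the caller can print the prompt again and call `read_int_line(stdin, &x)` in a loop until it returns 1 or EOF.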
-1

I think if you look carefully at the right-hand side of this page you will see many questions similar to yours. You can use fflush() on Windows.

Vicky
  • That isn't necessary. It depends on the compiler that I use on windows. I use Code:blocks IDE with gcc compiler. It doesn't allow flushing stdin by using `fflush()`. Also the question isn't specific to C on windows it is about standard C. – Aseem Bansal Jul 03 '13 at 06:43
  • What you might be referring to i.e. `fflush()` working for windows would be specific to Microsoft products if it is documented on MSDN i.e on Visual Studio or other Microsoft products. – Aseem Bansal Jul 03 '13 at 06:44
  • `fflush` is for clearing out file handle buffers (ie at the driver level or at the stdio buffering level), not for reading to the end of line of text. – Mark Lakata Jul 03 '13 at 16:28