4

I'm creating a command line application with a prompt. It works fine as long as the input fits inside the buffer, but when the input is larger I get some weird behavior. Below is a simplified minimal example which has the same bug.

int main()
{
    for (;;) {
        const int size = 8;
        char str[size];

        printf("> ");
        fgets(str, size, stdin);

        toupper_str(str);
        printf("Result: %s\n", str);
    }
}

It works fine if the input is shorter than size.

> asdf
Result: ASDF
> 

When the input is larger, the portion of the input in range is processed, and in the next loop iteration, the rest of the given input is immediately returned from fgets. This leads to that portion of the input also being processed and some weird output.

> asdfghjk
Result: ASDFGHJ
> Result: K

> 

I can see if the input is larger than or equal to size by comparing the last character to a newline. fgets retains the newline as long as it fits.

fgets(str, size, stdin);
if (str[strlen(str) - 1] != '\n') {
    fprintf(stderr, "Input too long\n");
}

When this is detected, how do I stop it from reading the rest of the too long input on the next iteration?

I've seen similar questions on here, but none that ask the same question.

jacwah
  • That is how `fgets` works. It does *not* discard the rest of the input, and your code should take that into account. If there is no `newline` at the end, the line was longer (except the last line of the file). Although if you missed the length by 1, the next input might contain *only* a `newline`. – Weather Vane Jun 12 '15 at 20:12
  • @WeatherVane The question is *how* to take it into account. It is obviously how `fgets` works. – jacwah Jun 12 '15 at 20:15
  • You have not said whether you want to discard the rest of the too-long input or consider it. Former: if `fgets` string does not terminate with a `newline` (before the `nul` of course) keep reading with `getchar` until `newline` or `EOF`. Latter: use allocated buffers and `realloc` if the input did not contain a final `newline`. Here in a different question is a way to use the technique. http://stackoverflow.com/questions/28254245/c-reading-a-text-file-separated-by-spaces-with-unbounded-word-size/28255082#28255082 – Weather Vane Jun 12 '15 at 20:27
  • I think your question is better stated as how to process *all* the input than find ways to dump input you cannot handle: that leads to GIGO. – Weather Vane Jun 12 '15 at 20:32
  • @WeatherVane As you can see in the provided code, the buffer is allocated on the stack and can as such not be `realloc`ed. – jacwah Jun 12 '15 at 20:35
  • @WeatherVane Sometimes you do not want a string over a certain length. But in most cases you are right, reading the whole input would be the correct way to solve the problem. – jacwah Jun 12 '15 at 20:38
  • if you do know the max size a line can have in the input file, you could allocate a proper buffer and read each whole line into it with fgets. Then you truncate the buffer to the relevant portion of the string (e.g. 7 chars) – Pynchia Jun 12 '15 at 21:39
  • @Pynchia Yes, but that would a) make code more complex than it needs to be and b) use more memory than needed. – jacwah Jun 12 '15 at 21:40
  • I think the code would be simpler and the buffer would be recycled on each line, but yes, life is often a trade-off between performance, cost, size, elegance, etc. :) – Pynchia Jun 12 '15 at 21:48
  • Hej, Anything more needed to answer this post? – chux - Reinstate Monica May 18 '17 at 14:48

2 Answers

6

how do I stop it from reading the rest of the too long input on the next iteration?

Code needs to 1) detect if input is "too long" and 2) consume the additional input.

fgets() will not overfill its buffer. If it does fill the buffer, the last char in the buffer is '\0', so set that element to a non-'\0' value before reading. Afterwards, a '\0' there tells the code that the entire buffer was filled. Then check whether the preceding char is a '\n'. If it is not an end-of-line, additional chars may remain in stdin.

char str[100];  // Ensure buffer is at least size 2
for (;;) {
  str[sizeof str - 1] = 'x';  // Sentinel: any non-'\0' value
  if (fgets(str, sizeof str, stdin) == NULL) {
    // No more to read or IO error
    break;
  }
  int extra_data_found = 0;
  if (str[sizeof str - 1] == '\0' && str[sizeof str - 2] != '\n') {
    // Cope with potential extra data in `stdin`: read and toss
    int ch;
    while ((ch = fgetc(stdin)) != '\n' && ch != EOF) {
      extra_data_found = 1;
    }
  }
  // Use `str` as needed, noting if additional unsaved data found
  foo(str, extra_data_found);
}

Note: on a file error, fgets() returns NULL and the contents of str are undefined.

Note: Instead of str[sizeof str - 1] == '\0', code could use strlen(str) == sizeof str - 1. This gets fooled should fgets() read a null character '\0'.

Corner cases:
1. Typical str will be up to 98 char and then '\n' and '\0'. Is it OK if the last str is 99 char and then '\0'?
2. If #1 is OK, then may a typical str have 99 char and then '\0'?
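
A minimal sketch of the strlen-based variant mentioned in the note above, wrapped in a helper. The name read_line_truncating and its return convention are assumptions made for illustration, not part of the original answer. It reports the input as too long only when at least one character other than the newline had to be discarded, which is one possible way to settle the corner cases above. Like any strlen-based check, it is fooled if the input contains a null character.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (name and return values are assumptions):
 * reads one line from fp into buf, without the trailing newline.
 * Returns -1 on end of input or error,
 *          1 if the line was too long and the excess was discarded,
 *          0 otherwise. */
int read_line_truncating(char *buf, size_t size, FILE *fp)
{
    if (size < 2 || fgets(buf, (int)size, fp) == NULL)
        return -1;

    size_t len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';   /* complete line: strip the newline */
        return 0;
    }

    /* No newline stored: either the buffer is full or the stream
     * ended without one. Read and toss the rest of the line. */
    int ch;
    int extra = 0;
    while ((ch = fgetc(fp)) != '\n' && ch != EOF)
        extra = 1;
    return extra;
}
```

With the question's 8-byte buffer, the input asdfghjk would make this helper return 1 with the buffer holding asdfghj, and the next call would start on a fresh line.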

chux - Reinstate Monica
3

When the input is too long, you need to read the remaining characters on stdin before continuing with the next iteration of the loop.

if (fgets(str, size, stdin) == NULL) {
    if (feof(stdin)) {
        // End of input: nothing more to read
        return 0;
    }
    else {
        perror("Could not read from stdin");
        exit(1);
    }
}
else if (strchr(str, '\n') == NULL) {
    // No newline stored: discard the rest of the line
    int c;
    while((c = getc(stdin)) != '\n' && c != EOF);
    fprintf(stderr, "Input too long\n");
}

If you are on a POSIX system like OS X or Linux, there is already the getline function, which reads an arbitrary-length, newline-terminated string from a stream. You can also find many free/open source versions of this function online.
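
A rough sketch of what a getline-based loop could look like, assuming a POSIX.1-2008 system. The helper name upcase_lines is hypothetical, and the inline toupper loop only stands in for the question's toupper_str, whose definition is not shown. getline allocates and grows the buffer itself, so no input is ever left behind for the next iteration.

```c
#define _POSIX_C_SOURCE 200809L
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: reads every line from in, uppercases it and
 * writes "Result: <line>" to out. Returns the number of lines read. */
long upcase_lines(FILE *in, FILE *out)
{
    char *line = NULL;   /* getline allocates and resizes this buffer */
    size_t cap = 0;
    ssize_t len;
    long count = 0;

    while ((len = getline(&line, &cap, in)) != -1) {
        if (len > 0 && line[len - 1] == '\n')
            line[--len] = '\0';          /* strip the newline */
        for (ssize_t i = 0; i < len; i++)
            line[i] = (char)toupper((unsigned char)line[i]);
        fprintf(out, "Result: %s\n", line);
        count++;
    }
    free(line);          /* the caller owns the buffer, even on failure */
    return count;
}
```

Calling upcase_lines(stdin, stdout) would process the whole input line by line. The trade-off is that a very long line consumes as much memory as it needs, so this fits best when no upper bound on line length is wanted.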

jacwah
  • `if (!feof(stdin)) ... "Could not read from stdin"... ` is dubious. Commonly, by user control or through re-directed input, `stdin` will supply input for a while and then `feof(stdin)` will be true. It is not "Could not read from stdin", it is more like "no more to read". – chux - Reinstate Monica Jun 13 '15 at 18:19
  • @chux `fgets` can return `NULL` in two cases: if end of file is met or if an error occurred. If it returned `NULL` and it's not end of file, it must be an error, thus "Could not read from stdin". The code should handle the case where `feof(stdin)` is true too though, I will edit the answer. – jacwah Jun 13 '15 at 19:36
  • @jacwah Agree `fgets()` returns `NULL` in 2 cases, but are there more? Consider pathological case: http://stackoverflow.com/questions/23388620/is-fgets-returning-null-with-a-short-bufffer-compliant (Not that you need to change answer because of this) – chux - Reinstate Monica Jun 13 '15 at 19:56
  • @chux Good find! Did not think of this. – jacwah Jun 13 '15 at 20:20