1

This seems like it should be a simple thing but after hours of searching I've found nothing...

I've got a function that reads an input string from stdin and sanitizes it. The problem is that when I hit enter without typing anything in, it apparently just reads in some junk from the input buffer.

In the following examples, the prompt is "input?" and everything that occurs after it on the same line is what I type. The line following the prompt echoes what the function has read.

First, here is what happens when I type something in both times. In this case, the function works exactly as intended.

input? abcd
abcd
input? efgh
efgh

Second, here is what happens when I type something in the first time, but just hit enter the second time:

input? abcd
abcd
input?
cd

And here is what happens when I just hit enter both times:

input?
y
input?
y

It happens to return either 'y' or '@' every time when I run it anew. 'y' is particularly dangerous for obvious reasons.

Here is my code:

#include <stdio.h>
#include <stdlib.h>

#define STRLEN 128

int main() {
    char str[STRLEN];
    promptString("input?", str);
    printf("%s\n", str);
    promptString("input?", str);
    printf("%s\n", str);

    return EXIT_SUCCESS;
}

void promptString(const char* _prompt, char* _writeTo) {    
    printf("%s ", _prompt);
    fgets(_writeTo, STRLEN, stdin);
    cleanString(_writeTo);

    return;
}

void cleanString(char* _str) {
    char temp[STRLEN];
    int i = 0;
    int j = 0;

    while (_str[i] < 32 || _str[i] > 126) 
        i++;

    while (_str[i] > 31 && _str[i] < 127) {
        temp[j] = _str[i];
        i++;
        j++;
    }

    i = 0;
    while (i < j) {
        _str[i] = temp[i];
        i++;
    }
    _str[i] = '\0';

    return;
}

I've tried various methods (even the unsafe ones) of flushing the input buffer (fseek, rewind, fflush). None of it has fixed this.

How can I detect an empty input so that I can re-prompt, instead of this annoying and potentially dangerous behavior?

Victor Zamanian
  • 3,100
  • 24
  • 31
evenex_code
  • 843
  • 1
  • 8
  • 20
  • 2
    http://stackoverflow.com/questions/4023895/how-to-read-string-entered-by-user-in-c/4023921#4023921 might be of interest, despite the fact your specific problem is not with the input but rather the post-processing. Still, the answer in that link shows a more robust way of doing input. – paxdiablo Jun 20 '13 at 22:59

2 Answers2

4

This part of cleanString

 while (_str[i] < 32 || _str[i] > 126) 
      i++;

jumps over \0 when the string is empty.

You should add _str[i] != '\0' into the loop's condition.

To detect an empty string, simply check it's length just after the input:

do {
  printf("%s ", _prompt);
  fgets(_writeTo, STRLEN, stdin);
} while (strlen(_writeTo) < 2);

(comparing with two because of '\n' which fgets puts into the end of buffer)

nullptr
  • 11,008
  • 1
  • 23
  • 18
2

Why do you have a bunch of variable names with leading underscores? That's nasty.

Anyway, the first thing you must do is check the return value of fgets. If it returns NULL, you didn't get any input. (You can then test feof or ferror to find out why you didn't get input.)

Moving on to cleanString, you have a while loop that consumes a sequence of non-printable characters (and you could use isprint for that instead of magic numbers), followed by a while loop that consumes a sequence of printable characters. If the input string doesn't consist of a sequence of non-printables followed by a sequence of printables, you will either consume too much or not enough. Why not use a single loop?

while(str[i]) {
    if(isprint(str[i]))
        temp[j++] = str[i];
    ++i;
}

This is guaranteed to consume the whole string until the \0 terminator, and it can't keep going past the terminator, and it copies the "good" characters to temp. I assume that's what you wanted.

You don't even really need to use a temp buffer, you could just copy from str[i] to str[j], since j can never get ahead of i you'll never be overwriting anything that you haven't already processed.

  • The underscores are my own personal convention for input variables. It lets me differentiate them from automatic variables at a glance. – evenex_code Jun 20 '13 at 23:06
  • I've tried using NULL to check the return of fgets, but that didn't fix the problem. I didn't know about isprint() though, this is a nice solution. – evenex_code Jun 20 '13 at 23:07
  • The fact that you don't check the return value of `fgets` is a separate problem, which you hadn't noticed yet. Hit ctrl-D at the prompt and watch what happens –  Jun 20 '13 at 23:24
  • I don't understand. Ctrl+D just closed the terminal. – evenex_code Jun 21 '13 at 00:49
  • Is that what your program was *supposed* to do? –  Jun 21 '13 at 00:50
  • Oh, no, I see it now. It repeats the prompt (because I set cleanString to return a -1 if the input was blank, and made it the conditional of a while loop that contains the prompt.) – evenex_code Jun 21 '13 at 00:52
  • But on the second prompt, ctrl+D closes the program. So what is going on here? Where's the problem? – evenex_code Jun 21 '13 at 00:53
  • 1
    You are passing the buffer to your clean function even if fgets returned NULL. In that case the buffer contains garbage (uninitialized memory or leftover string from previous call) –  Jun 21 '13 at 00:57
  • Gotcha. Yeah that would be bad. I've changed it to do { err = fgets } while( err == NULL || cleanString() < 0) on the assumption that the first condition is checked first. – evenex_code Jun 21 '13 at 01:03
  • 1
    @evenex_code be careful with the underscores. I used to use them, then I switched to having them at the end because of possible clashes. I think these are the only cases where it is an issue, 'All identifiers beginning with an underscore are reserved for ordinary identifiers (functions, variables, typedefs, enumeration constants) with file scope', but I just switched instead so I never had to worry about it, and then after a while, I just stopped using them all together (except for include guards, e.g. `#ifndef MY_HEADER_H_`..., and that is only to prevent clashes with other code, and . . . – RastaJedi Sep 14 '16 at 17:47
  • . . . since rules determine the underscore can't be at the beginning there [because followed by capital letter], I was using at the end to be consistent), instead I just started using the `this` pointer when I needed to be explicit about which variable I was using. – RastaJedi Sep 14 '16 at 17:48
  • Yep, good advice. Learned this lesson the hard way myself (clashed with a 3rd party lib using the same scheme) some time after making this post. Was a pretty difficult bug to identify. And anyway, being more explicit about intent is certainly never a bad thing. – evenex_code Sep 15 '16 at 21:08