I am trying to read variable-length user input and perform some operation on it (like searching for a substring within a string).

The issue is that I don't know in advance how long my strings will be; the text could quite possibly be 3000-4000 characters.

I am attaching the sample code which I have tried and the output:

char t[],p[];
int main(int argc, char** argv) {
    fflush(stdin);
    printf(" enter a string\n");
    scanf("%s",t);

    printf(" enter a pattern\n");
    scanf("%s",p);

    int m=strlen(t);
    int n =strlen(p);
    printf(" text is %s %d  pattrn is %s %d \n",t,m,p,n);
    return (EXIT_SUCCESS);
}

and the output is:

enter a string
bhavya
enter a pattern
av
text is bav 3  pattrn is av 2
bhavs
  • Please note that using fflush on stdin (or any input stream) is undefined behavior in C. So it may cause your computer to halt and catch fire. ISO 9899:1999 7.19.5.2. – Lundin Oct 06 '11 at 11:27

4 Answers

Please don't ever use unsafe things like scanf("%s") or my personal non-favourite, gets() - there's no way to prevent buffer overflows for things like that.

You can use a safer input method such as:

#include <stdio.h>
#include <string.h>

#define OK       0
#define NO_INPUT 1
#define TOO_LONG 2
static int getLine (char *prmpt, char *buff, size_t sz) {
    int ch, extra;

    // Get line with buffer overrun protection.
    if (prmpt != NULL) {
        printf ("%s", prmpt);
        fflush (stdout);
    }
    if (fgets (buff, sz, stdin) == NULL)
        return NO_INPUT;

    // If it was too long, there'll be no newline. In that case, we flush
    // to end of line so that excess doesn't affect the next call.
    if (buff[strlen(buff)-1] != '\n') {
        extra = 0;
        while (((ch = getchar()) != '\n') && (ch != EOF))
            extra = 1;
        return (extra == 1) ? TOO_LONG : OK;
    }

    // Otherwise remove newline and give string back to caller.
    buff[strlen(buff)-1] = '\0';
    return OK;
}

You can then set the maximum size and it will detect if too much data has been entered on the line, flushing the rest of the line as well so it doesn't affect your next input operation.

You can test it with something like:

// Test program for getLine().

int main (void) {
    int rc;
    char buff[10];

    rc = getLine ("Enter string> ", buff, sizeof(buff));
    if (rc == NO_INPUT) {
        // Extra NL since my system doesn't output that on EOF.
        printf ("\nNo input\n");
        return 1;
    }

    if (rc == TOO_LONG) {
        printf ("Input too long [%s]\n", buff);
        return 1;
    }

    printf ("OK [%s]\n", buff);

    return 0;
}
paxdiablo
  • It is somewhat conceivable that `fgets` returns a buffer with no characters. If that happens, the code in `getLine` function will attempt to access `buffer[-1]` which is Undefined Behaviour. – pmg Oct 06 '11 at 10:17
  • @pmg, that _may_ happen if you do something silly like pass a buffer size indicating you want no characters but I'm not even sure of that. In the sane cases, you will either always have data or NULL will be returned (so that you don't check the buffer). This function has been tested with all the cases I could come up with (empty lines, end-of-files, larger-than-desired lines, shorter, exact size and so on), and with no problems. If you find an edge case where it doesn't work, let me know and I'll fix it, especially since it's used quite a bit in production code I've written :-) – paxdiablo Oct 06 '11 at 10:22
  • Well, you could easily avoid the risk _and_ make your function a bit faster by storing `strlen(buff)` in a local variable and checking that it's not zero before trying to access the last character. That way, you also won't need to call `strlen()` twice. – Ilmari Karonen Oct 06 '11 at 10:34
  • @pax thank you very much for this code, but what I am wondering is that there can be scenarios wherein I would like to read more than 5000 characters, will I still be able to use something like this ? – bhavs Oct 06 '11 at 12:17
  • @Bhavya, yes, you just have to create `buff` to be big enough (and possibly move it off the stack to either a global variable or allocated on the heap, if it's really big). – paxdiablo Oct 06 '11 at 12:39
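
The change suggested in the comments above (cache the result of `strlen()` and check that it is not zero before touching the last character) might look roughly like the sketch below. The name `getLineFrom` and the `FILE *in` parameter are hypothetical additions so the function can read from any stream; returning `NO_INPUT` for a zero-length result is an assumption of this sketch, not something the original answer specifies.

```c
#include <stdio.h>
#include <string.h>

#define OK       0
#define NO_INPUT 1
#define TOO_LONG 2

/* Hypothetical variant of getLine(): reads from any stream and calls
   strlen() once, so buff[len-1] is never evaluated with len == 0. */
static int getLineFrom (FILE *in, char *buff, size_t sz) {
    int ch, extra = 0;
    size_t len;

    if (buff == NULL || sz == 0)
        return NO_INPUT;
    if (fgets (buff, (int) sz, in) == NULL)
        return NO_INPUT;

    len = strlen (buff);
    if (len == 0)               /* assumption: treat "nothing stored" as no input */
        return NO_INPUT;

    /* No newline stored: discard the rest of the line, if there is any. */
    if (buff[len-1] != '\n') {
        while (((ch = getc (in)) != '\n') && (ch != EOF))
            extra = 1;
        return extra ? TOO_LONG : OK;
    }

    /* Otherwise remove the newline and give the string back to the caller. */
    buff[len-1] = '\0';
    return OK;
}
```

Calling `getLineFrom (stdin, buff, sizeof(buff))` should behave like the original `getLine` minus the prompt for ordinary input.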

In practice you don't need to size the buffer exactly. Give yourself some slack with a fixed-size buffer on the stack and operate on that. Once you want to pass the data further, you can use strdup(buffer) to get a copy on the heap. Know your limits. :-)

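#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char** argv) {
    char text[4096];
    char pattern[4096];

    printf(" enter a string\n");
    if (fgets(text, sizeof(text), stdin) == NULL)
        return (EXIT_FAILURE);
    text[strcspn(text, "\n")] = '\0';    // fgets keeps the newline; strip it

    printf(" enter a pattern\n");
    if (fgets(pattern, sizeof(pattern), stdin) == NULL)
        return (EXIT_FAILURE);
    pattern[strcspn(pattern, "\n")] = '\0';

    size_t m = strlen(text);
    size_t n = strlen(pattern);
    printf(" text is %s %zu  pattern is %s %zu \n", text, m, pattern, n);
    return (EXIT_SUCCESS);
}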
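
The "copy it to the heap" step mentioned above could be sketched as follows. `strdup` is POSIX (and C23) rather than C99, so this sketch spells the copy out with `malloc` and `memcpy` instead; the helper name `readLineCopy` and its `FILE *in` parameter are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line (at most 4095 characters) into a
   stack buffer, strip the newline, and return a heap copy of it.
   Returns NULL on EOF, read error, or allocation failure.
   The caller must free() the result. */
static char *readLineCopy (FILE *in) {
    char buffer[4096];
    size_t len;
    char *copy;

    if (fgets (buffer, sizeof buffer, in) == NULL)
        return NULL;
    buffer[strcspn (buffer, "\n")] = '\0';   /* drop the trailing newline */

    /* Same effect as strdup(buffer), written with standard C only. */
    len = strlen (buffer);
    copy = malloc (len + 1);
    if (copy != NULL)
        memcpy (copy, buffer, len + 1);
    return copy;
}
```

With the text and pattern both read this way, `strstr(text, pattern)` can then do the substring search the question asks about.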
RushPL

The main problem in your case is that the char arrays are declared without a size. Just specify the array size in the declaration.

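#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char** argv) {
    char s1[4096], s2[4096];    // char arrays, not int
    printf(" enter a string\n");
    if (scanf("%4095s", s1) != 1)    // width limit leaves room for '\0'
        return (EXIT_FAILURE);

    printf(" enter a pattern\n");
    if (scanf("%4095s", s2) != 1)
        return (EXIT_FAILURE);

    size_t m = strlen(s1);
    size_t n = strlen(s2);
    printf(" text is %s of length %zu, pattern is %s of length %zu \n", s1, m, s2, n);
    return (EXIT_SUCCESS);
}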
Bogdan

Don't use scanf, or gets for that matter, because as you say, there is no real way of knowing how long the input is going to be. Rather use fgets, passing stdin as the last parameter. fgets allows you to specify the maximum number of characters that should be read. You can always go back and read more if you need to.

scanf("%s") and gets read until they find a terminating character and may well exceed the length of your buffer, causing some hard-to-fix problems.
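
The "go back and read more" idea can be sketched as a loop that keeps calling fgets and grows a heap buffer until it sees the newline. This is a minimal sketch, not a hardened implementation: the helper name `readLongLine`, its `FILE *in` parameter, and the starting capacity of 128 are all arbitrary choices made here.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read one line of any length from `in`.
   Returns a malloc'd string without the newline, or NULL if nothing
   could be read or memory ran out. The caller must free() the result. */
static char *readLongLine (FILE *in) {
    size_t cap = 128, len = 0;
    char *line = malloc (cap);

    if (line == NULL)
        return NULL;

    /* Each fgets() call appends at most cap - len - 1 characters. */
    while (fgets (line + len, (int) (cap - len), in) != NULL) {
        len += strlen (line + len);
        if (len > 0 && line[len-1] == '\n') {
            line[len-1] = '\0';          /* complete line: strip newline */
            return line;
        }

        /* No newline yet: double the buffer and go back for more. */
        char *bigger = realloc (line, cap * 2);
        if (bigger == NULL) {
            free (line);
            return NULL;
        }
        line = bigger;
        cap *= 2;
    }

    if (len == 0) {                      /* EOF before any character */
        free (line);
        return NULL;
    }
    return line;                         /* last line had no newline */
}
```

A call such as `char *text = readLongLine(stdin);` would then handle the 3000-4000 character inputs from the question without fixing a maximum size up front.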

doron
  • `scanf` can, with width specifier, be used safely: `char name[40]; if (scanf("%39s", name) != 1) /* error */;` – pmg Oct 06 '11 at 10:14