I'm trying to get multiple consecutive user inputs, which will later be printed. It works fine as long as you don't input a string longer than the boundary, but when you do input something longer it overflows into the next input. For example, this code:

printf("Enter student's first name:\t");
char nameF[20];
scanf("%20s", nameF);

printf("Enter student's last name:\t");
char nameL[20];
scanf("%20s", nameL);

printf("Enter student's ID Number:\t");
char id[9];
scanf("%9s", id);

printf("Enter student's e-mail:\t");
char mail[26];
scanf("%26s", mail);

Results in this:

Enter student's first name:     01234567890123456789012345
Enter student's last name:      Enter student's ID Number:      0123456789012345
Enter student's e-mail:

First:          01234567890123456789
Last:
ID Number:      012345678
E-mail:         9012345

I skipped over the print function for the sake of not having a wall of code. If you want to see that as well, let me know and I'll add it.

It should be noted that I tried fgets(), with very similar results. If I replace each scanf() line with fgets(var, sizeof(var), stdin);, I get this:

Enter student's first name:     01234567890123456789012345
Enter student's last name:      Enter student's ID Number:      0123456789012345
Enter student's e-mail:

First:          0123456789012345678
Last:           9012345

ID Number:      01234567
E-mail:         89012345

When I inserted getchar() after the scanf() statements, it ignored the overflow input, but still skipped over the next scan. I have tried throwaway input variables, and I've looked into other input methods, but couldn't seem to find anything to help. I'm sure this is a pretty simple fix for someone with experience, but I'm only a couple of weeks into learning C, so I don't know much beyond pointers and structs.

I can almost guarantee somebody will find a duplicate of this - they always do - but I did search around, for about a solid hour, and I didn't find anything that worked.

Thanks in advance for any help you can give me.

EDIT: I understand why the input is overflowing into the next one. It saves the first x characters, and the rest remains in the buffer and is entered into the next input scan.

I guess the more narrowed down question would be: How can I clear or divert the input buffer so that if the user inputs extra characters they won't remain in the buffer?

Jasper
  • Apparently you're allowing free-form input. Therefore you should use a large generic input buffer. Only after you have validated the input string should the input be copied to the appropriate variable. – sawdust May 08 '14 at 19:33
  • possible duplicate of [C, flushing stdin](http://stackoverflow.com/questions/3876091/c-flushing-stdin) – Fred Foo May 08 '14 at 19:45
  • Although part of the problem here is covered by the proposed duplicate, the overlong `%20s` format specifications part of the problem is not covered by the proposed duplicate, so it is not appropriate to close this question as a duplicate of the other. – Jonathan Leffler May 08 '14 at 20:14
  • I looked at that post, but I don't understand enough about the `"%*[^\n]\n"` stuff to utilize it. – Jasper May 08 '14 at 20:23
  • The `%*[^\n]` part of the format is a scan-set with assignment suppression. It will read an arbitrary number of non-newlines. The final newline is a mistake. All white space in formats (blanks, tabs, newlines) maps to an optional sequence of white space followed by a non-white space. In this case, it reads characters and doesn't stop reading characters until it got something other than white space. This is diabolical for interactive input. The correct notation would be `%*[n]`, looking for a newline character in a scan-set and not assigning it. This is what I mean about `scanf()` being tricky! – Jonathan Leffler May 09 '14 at 06:12
  • Yeesh. I think I'll avoid that for now. – Jasper May 09 '14 at 06:18
  • The second scan-set should be `%*[\n]` — minor mistake. – Jonathan Leffler May 09 '14 at 06:20
  • Can you use variable names to tell the scans how much to read? For example, `%[var]s` would read a number of characters equal to the value of `var`. – Jasper May 09 '14 at 07:48
  • No. In `printf()`, you can use notations such as `%*.*s` with each `*` representing a number read from the argument list for a length. In `scanf()`, you can't do that; the `*` is for assignment suppression. If you read [The Practice of Programming](http://cm.bell-labs.com/cm/cs/tpop/) (an excellent book), Kernighan & Pike recommend creating the format string with the length you need on the fly (using `snprintf()` and a fair amount of care). – Jonathan Leffler May 09 '14 at 17:41
  • @JonathanLeffler [I knew that. Derp.] I just ask because I have NAME_LEN and other constants for input lengths, and if they change I have to go through and fix all of the format specifiers. It's annoying, but it doesn't look like `snprintf()` can be used to test formatting in the same way `sscanf()` can. – Jasper May 09 '14 at 17:48
  • It's a problem. Having code to generate the format strings for `scanf()` is probably as good as it gets. It depends on how widespread the format strings are and how often you think you might change lengths. The extra flexibility from format generation is useful, but makes the code harder to understand. You can try playing games with string concatenation and preprocessor tricks, but there are things that C can handle and the preprocessor can't. But this might make another question; we're running long on comments here. – Jonathan Leffler May 09 '14 at 17:54
  • Nah, I don't think it's worth it. It's just a toy program anyways. If I have another question I can't figure out I'll post it and let you know. Thanks a ton! – Jasper May 09 '14 at 18:03
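
The format-generation idea from the comments above (building the `scanf()` format with `snprintf()` so the width always follows the buffer size) could look roughly like the sketch below. The helper names `make_word_format` and `scan_word` are made up for illustration and do not come from the thread or from the book; the sketch uses `sscanf()` on a string so it can be tried without typing input.

```c
#include <stdio.h>

/* Hypothetical helper: writes "%<bufsize-1>s" into fmt, for example
 * "%19s" for a 20-byte buffer. Returns 0 on success, -1 if bufsize is
 * too small or fmt cannot hold the result. */
static int make_word_format(char *fmt, size_t fmtsize, size_t bufsize)
{
    if (bufsize < 2)
        return -1;              /* need room for one character plus '\0' */
    int n = snprintf(fmt, fmtsize, "%%%zus", bufsize - 1);
    return (n < 0 || (size_t)n >= fmtsize) ? -1 : 0;
}

/* Hypothetical helper: copies the first whitespace-delimited word of
 * text into buf, never writing more than bufsize bytes.
 * Returns 1 if a word was stored, 0 otherwise. */
static int scan_word(const char *text, char *buf, size_t bufsize)
{
    char fmt[32];
    if (make_word_format(fmt, sizeof(fmt), bufsize) != 0)
        return 0;
    return sscanf(text, fmt, buf) == 1;
}
```

With this, a call such as `scan_word(line, id, sizeof(id))` keeps working if the size of `id` changes, at the cost of a format string the compiler can no longer check.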

3 Answers

I believe scanf is using buffered input. It allows you to enter characters until you press enter. At that point, it takes the 20 that you specify, and the remaining buffer is then used as the input for the next scanf call. fgets's results are more predictable, with the next string being populated with the remnants of the previous buffer.

I cannot, however, explain why last name is blank, yet email is not in the first example.

Take a look here for an example of how to flush the type-ahead input. You can write a function like so:

void flushInput(void)
{
    int c;
    while ((c = getchar()) != '\n' && c != EOF)
        ;  /* discard */
}

...and then call flushInput(); after each scanf().
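
As a rough sketch of how the pieces fit together, the read and the flush can be wrapped in one function. This is an illustration only: `flush_line` and `read_name` are hypothetical names, the functions take a `FILE *` (pass `stdin` for keyboard input, which makes them equivalent to `scanf()` and `getchar()`), and the width is 19 so that the terminator fits in a 20-byte array.

```c
#include <stdio.h>

/* Discard the rest of the current line, including the newline. */
static void flush_line(FILE *fp)
{
    int c;
    while ((c = getc(fp)) != '\n' && c != EOF)
        ;  /* discard */
}

/* Hypothetical helper: read one word of at most 19 characters, then
 * drop whatever else was typed on that line.
 * Returns 1 on success, 0 on end of input or error. */
static int read_name(FILE *fp, char name[20])
{
    int ok = (fscanf(fp, "%19s", name) == 1);
    flush_line(fp);
    return ok;
}
```

Calling `read_name(stdin, nameF)` and then `read_name(stdin, nameL)` means the characters typed past the limit on the first line are thrown away instead of becoming the last name.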

As an additional note, I believe you should be specifying one character less in your scanf calls. It appears that....

char nameF[20];
scanf("%20s", nameF);

...is putting 20 characters into nameF and the '\0' terminator is going beyond the bounds of the array. fgets appears to properly place the terminator within the specified size.

Jonathan Leffler
rjp
  • Your last paragraph covers a very important point (see my answer), but I missed it when scanning your answer before writing my own. I think you should emphasize it more. – Jonathan Leffler May 08 '14 at 19:54
  • Thank you , the `flushInput()` was what I was looking for. Also, I fixed the scan range to allow for the `\0`. – Jasper May 08 '14 at 20:03
  • @JonathanLeffler I posted a new question that I think you could figure out pretty easily. http://stackoverflow.com/questions/23572319/validating-an-email-address-with-sscanf-format-specifiers – Jasper May 09 '14 at 19:23

There's an off-by-one problem:

char nameF[20];
scanf("%20s", nameF);

This is incorrect; the size in the format string doesn't count the trailing null, so you have to use 19 in the format with 20 in the variable definition, or 20 in the format and 21 in the variable definition:

char nameF[21];
scanf("%20s", nameF);

The original code means that you can easily have one variable clobbering another with its trailing null.

This is different from most other parts of C, where you specify the whole length and the function reserves the last byte for the terminating null byte (for example, fgets()). In an ideal world, the inconsistency would be fixed, but it has been there since the Standard I/O library was created in the late 70s, and too much code would have broken if the C standard had changed the behaviour. So, you have to be aware of this quirk and work around it accordingly.

Also note that when it reaches the end of the format specified, scanf() leaves any left-over data on the line for the next input. If you want line-by-line input, read a line of data (fgets() or POSIX getline()), then scan with sscanf(). This is very often a better way of doing business.

char line[4096];

if (fgets(line, sizeof(line), stdin) == NULL)
{
    /* ...handle EOF or read error... */
}
if (sscanf(line, "%19s", nameF) != 1)
{
    /* ...handle format error... */
}

Note that you should error check every raw input function (such as fgets()) for EOF, and check that sscanf() or scanf() returns the expected number of conversions.
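
A fuller sketch of this approach, with the error checks filled in, might look like the following. The function name `read_field` and its return convention are assumptions made for the example. It also handles a line longer than the buffer, which fgets() signals by storing no newline.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one whole line from fp and copy its first
 * word (at most 19 characters) into out.
 * Returns 1 on success, 0 if the line held no word, EOF at end of input. */
static int read_field(FILE *fp, char out[20])
{
    char line[4096];

    if (fgets(line, sizeof(line), fp) == NULL)
        return EOF;                 /* end of input or read error */

    /* No newline stored: the line was longer than the buffer (or input
     * ended without a newline). Discard the remainder so it cannot
     * leak into the next read. */
    if (strchr(line, '\n') == NULL) {
        int c;
        while ((c = getc(fp)) != '\n' && c != EOF)
            ;  /* discard */
    }

    return sscanf(line, "%19s", out) == 1;  /* 0 for a blank line */
}
```

Because the whole line is consumed before it is scanned, extra characters after the first 19 stay in `line` and are simply ignored; nothing is left behind for the next prompt.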

Jonathan Leffler
  • Thanks, I fixed the off-by-one bug. Unfortunately, the `sscanf()` stuff is currently a bit beyond the scope of my C know-how. I just need to flush the rest of the input buffer, for which RJP gave me a simple solution. – Jasper May 08 '14 at 20:06
  • I'm not sure why you can't use `sscanf()` if you can use `scanf()`; they're very, very similar. Using `fgets()` avoids needing to flush the input buffer; it reads everything on the line before you use `sscanf()` on it (and can lead to better error reporting). However, what was proposed by RJP should work too. On some systems (Microsoft), you could use `fflush(stdin)` and probably get the extra data flushed. However, standard C says that operation is undefined, and the `getchar()` loop is good (it was done correctly). – Jonathan Leffler May 08 '14 at 20:11
  • `fflush(stdin)` was among the things I tried. I am working on Linux, so that didn't work. I'll look up `sscanf()`. I'm sure I can figure it out, it just seemed complex from my perspective. I think I'm largely just unused to using input functions directly as expressions. As I said, I'm still rather newb-ish. Thanks for the advice! – Jasper May 08 '14 at 20:15
  • The Linux documentation seems to suggest `fflush(stdin)` should work: _For input streams, fflush() discards any buffered data that has been fetched from the underlying file, but has not been consumed by the application._ Empirically, I've not managed to make it work; you found the same problem. This has been mentioned before; I don't know that there's a definitive question/answer to point you at. And I've not gone looking at the glibc source code to see why it does not behave as documented. – Jonathan Leffler May 08 '14 at 20:17
  • Huh. Well, that's strange. I would have no clue how to go about analyzing that, but it seems like it might be worth somebody's time to look into it. – Jasper May 08 '14 at 20:20
  • So, after much tinkering with input validation (as well as learning about `sscanf()` and several nifty string functions that I wish I had known about earlier), I found that your method works better, because it avoids overwriting adjacent array space. The only weakness would be someone pasting 4000+ chars into the input. :P – Jasper May 09 '14 at 05:46
  • Unfortunately, I did come across another problem that I expect you could answer with relative ease. Would you rather I started a new question, for clarity, instead of trying to ask it in the comments? – Jasper May 09 '14 at 05:47
  • `fgets()` handles lines longer than 4095 characters/bytes (when given 4096 bytes for the buffer) by terminating the input. You can tell this happened because the last character before the null is not a newline. So, you'd get slightly odd results, but it really serves them right for trying to break your program. – Jonathan Leffler May 09 '14 at 05:48
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/52352/discussion-between-jonathan-leffler-and-theqz) – Jonathan Leffler May 09 '14 at 05:49

It's not the program's fault, but I will tell you what happens when the limit is exceeded. The characters that fit within the limit are stored in ID, and the remaining characters are carried over and stored in the next variable, which is E-mail. The program does not wait for you to enter the e-mail because the characters that overflowed from ID are still in the input and get stored in E-mail. I hope this helps.

Adarsh Sojitra