I'm trying to read a string via scanf as follows:

char input[8];
scanf("%s",input);

It turns out that the program can read more than 8 characters. For example, if I enter 123456789012345, strlen(input) returns 15.

However, when I declare input as:

char input[4];
scanf("%s",input);

Entering "12345" causes '16146 segmentation fault'. Does anyone know why this happens?

klutt
Teng Long
  • It happens because you are not using `"%7s"` or `"%3s"`, respectively, like you should. – Nominal Animal Aug 20 '18 at 03:14
  • When the string entered is longer than the space available to store it, you run into 'undefined behaviour'. Anything can happen. That includes crashing and appearing to work, and reformatting your disk drive. See [How to prevent `scanf()` causing a buffer overflow?](https://stackoverflow.com/questions/1621394). – Jonathan Leffler Aug 20 '18 at 03:20
  • You're invoking [*undefined behavior*](https://en.cppreference.com/w/cpp/language/ub). This means that literally, anything can happen, including a crash, or even time travel (yes, really). Undefined behavior also includes your program appearing to "work fine", but in reality, you're just getting lucky and your program could behave differently when you make even the smallest changes that normally shouldn't affect the program behavior. Just because your program appears to work doesn't mean it's well-formed, and you're not guaranteed to get an error when you write an ill-formed program. – eesiraed Aug 20 '18 at 04:34

2 Answers

Technically, both cases invoke undefined behavior: in each one, scanf writes past the end of the array. That the first case happens to work on your system should not be taken to mean that your program is well-defined. Testing can only indicate the presence of bugs, not their absence.

Since you're still learning C, I will take the opportunity to offer you advice for reading input from stdin: always limit the length of input that will be read to the length of the buffer it's being read into, reserving one spot at the end for the null terminator.

If you want to use scanf to read strings from stdin, then it is safer to prefix the string format specifier with the maximum length of the string than to use a raw "%s". For example, if I had a char buffer[20]; that was the destination of a call to scanf, I would use the format string "%19s".

Govind Parmar

Both cases are so-called undefined behavior, which should be avoided at all costs. Few bugs are as tricky to find as those caused by it.

So why does this work? Well, that's the problem with undefined behavior. It may work. You have no guarantees at all.

Read more about UB here: Undefined, unspecified and implementation-defined behavior

klutt