
Given a char array with 5 elements, it is well known that if you use scanf() to read in exactly 5 characters, scanf() will fill the array and then clobber memory by writing a null character '\0' one byte past the end, without generating an error (I'm calling it a 6th element, but I know it is memory that is not part of the array), as described here: Null termination of char array

However, when you try to read in 6 characters or more, an error is generated, apparently because the OS detects that memory is being clobbered and the kernel sends a signal. Can someone clear up why an error is not generated in the first case of memory clobbering above?

Example code:

// ex1.c
#include <stdio.h>
int main(void){
  char arr[5];
  scanf("%s", arr);
  printf("%s\n", arr);
  return 0;
}

Compile, run and enter four characters: 1234. This stores them in the array correctly and doesn't clobber memory. No error here.

$ ./ex1
1234
1234

Run again and enter five characters. This will clobber memory because scanf() stored an extra '\0' null character in memory after the 5th element. No error is generated.

$ ./ex1
12345
12345

Now enter six characters, which we expect to clobber memory. The error that is generated looks like (i.e. I'm guessing) the result of a signal sent by the kernel saying that we just clobbered the stack (local memory) somehow. Why is an error being generated for this memory clobbering but not for the previous one above?

$ ./ex1
123456
123456
*** stack smashing detected ***: ./ex1 terminated
Aborted (core dumped)

This seems to happen no matter what size I make the array.

Jerry Marbas
  • I'm guessing that scanf is compiled without stack-smashing protection while your compiler is compiling with it. Your standard C library may be built with different options from your code. Stack-smashing protection requires compiler support. – Vality Aug 01 '15 at 17:34
  • Probably the memory is aligned on even bytes. After the 5th byte there is another unused byte. – Paul Ogilvie Aug 01 '15 at 17:35
  • Yeah, alignment is my guess too... does it work if the array has size 4? – Adam D. Ruppe Aug 01 '15 at 17:36
  • Yeah, so far I've tried it with size 4 and a lot of other different sized arrays and the behaviour is the same. No error is generated if you enter the exact number of chars the array can hold. – Jerry Marbas Aug 01 '15 at 17:42
  • Close voting because OP has done a bad thing, invoked UB, knows it, but still wants the consequences explained. 'I poured gasoline over myself and lit a match. Why, and how, am I seriously burned?' – Martin James Aug 01 '15 at 17:42
  • I'm voting to close this question as off-topic because OP wants UB explained, (again). – Martin James Aug 01 '15 at 17:43
  • @jerry what I'd do myself now is to disassemble the compiled program and look at the code to see what is going on. (I can't do it myself since this is specific to an implementation, and I'm running a different version or something; my computer has no detection at all.) If it isn't alignment, my next guess would be the implementation adds an extra byte before triggering the error precisely because a lot of programs make this off-by-one mistake so they want to catch it without breaking those (already broken, but users expect them to work) apps. – Adam D. Ruppe Aug 01 '15 at 17:47
  • If you want to boil this down to its root cause, get the C implementation's sources, the OS's sources, take a debugger and a good amount of time. – alk Aug 01 '15 at 17:49
  • Could you try compiling with `-fstack-protector-all` and run it again? When I do this on my system, gcc detects writing beyond the string even when I write one extra character. – Sergey Kalinichenko Aug 01 '15 at 17:51
  • Here is a [good reading on the subject](ftp://gcc.gnu.org/pub/gcc/summit/2003/Stackguard.pdf). – Sergey Kalinichenko Aug 01 '15 at 17:52
  • I recompiled with -fstack-protector-all and ran it again. I got the same results, i.e. no error when I fill the array exactly. Thanks for your reply and the link. I have always wondered about this and will let you know what I find. – Jerry Marbas Aug 01 '15 at 18:11
  • Thanks for the suggestion @AdamD.Ruppe. I know assembler but have never disassembled anything before. I'm also going to write a loop that will fill arrays of large sizes using scanf with the exact number of chars and see whether any of them generate errors. I'm using Ubuntu Studio 14.04 32bit / gcc 4.8.4. I will let you know what I find. I think they will close this thread and I don't even know what they mean by UB.... – Jerry Marbas Aug 01 '15 at 18:16

4 Answers


Why is an error being generated for this memory clobbering but not for the previous one above?

Because for the 1st test it seemed to work just because of (bad) luck.

In both cases arr was accessed out of bounds, and by doing so the code invoked undefined behaviour. This means the code might do what you expect, or not, or anything else, like rebooting the machine or formatting the disk.

C does not check memory accesses, but leaves this to the programmer, who could have made the call to scanf() safe by doing:

char arr[5];
scanf("%4s", arr); /* Stop scanning after 4th character. */
alk
  • Yes, it is undefined in the spec, but the question is why this implementation is doing it this way? Understanding the implementation defined behavior can be as important as understanding spec defined behavior. Sometimes, you target a particular implementation and depend on it, and sometimes it is just for educational purposes, but either way, I'm not satisfied leaving it at "it is undefined in the C spec, sorry". – Adam D. Ruppe Aug 01 '15 at 17:40
  • @AdamD.Ruppe: So we'd have as many answers (to this question) as we'd have implementations? – alk Aug 01 '15 at 17:41
  • He or she is asking about a specific implementation: "The error that is generated looks like(ie. Im guessing) its the result of a signal sent by the kernel saying that we just clobbered the stack(local memory) somehow....Why is an error being generated for this memory clobbering but not for the previous one above?" – Adam D. Ruppe Aug 01 '15 at 17:43
  • @AdamD.Ruppe: "*is asking about a specific implementation*" does s/he? – alk Aug 01 '15 at 17:46
  • I agree with alk. Stack smashing detection is not part of the language, and the question has absolutely no information about the implementation being used by the OP. So the only reasonable answer is, "It's UB, don't do that." – user3386109 Aug 01 '15 at 17:48

The behaviour is undefined in both cases, because in each of them the input plus its terminating '\0' needs more space than the buffer can hold.

The stack smashing detection mechanism works by using canaries. When the canary value is found to have been overwritten (it is checked before the function returns), the program is aborted with SIGABRT. The reason why it doesn't get generated in the first case is probably because there's at least one extra byte of memory after the array (typically one-past-the-end of an object is required to be a valid pointer. But it can't be used to store to values -- legally). In essence, the canary wasn't overwritten when you input 1 extra char, but it does get overwritten when you input 2 extra bytes, for one reason or another, triggering SIGABRT.

If you have some other variables after arr such as:

#include <stdio.h>
int main(void){
  char arr[5];
  char var[128];
  scanf("%s", arr);
  printf("%s\n", arr);
  return 0;
}

Then the canary may not be overwritten when you input a few more bytes, as the overflow might simply be overwriting var, thus delaying the buffer overflow detection by the compiler. This is a plausible explanation. But in any case, your program is invalid if it overruns the buffer, and you should not rely on the stack smashing detection by the compiler to save you.

P.P
  • Thanks a lot! I have been wondering why arrays acted this way for years but never asked because it's always taken for granted. Never heard of canaries before. You have probably cleared up a lot of future security flaws and misconceptions with this answer. Cheers. PS: I can't upvote your answer yet because I don't have enough reputation, but will as soon as I do. – Jerry Marbas Aug 01 '15 at 18:30
  • Glad to be of help! You might also like to read: [Anatomy of a Stack Smashing Attack and How GCC Prevents It](http://www.drdobbs.com/security/anatomy-of-a-stack-smashing-attack-and-h/240001832) and ["Strong" stack protection for GCC](https://lwn.net/Articles/584225/). – P.P Aug 01 '15 at 18:45
  • "*... probably because there's at least one extra byte of memory after the array (typically one-past-the-end of an object is required to be a valid pointer. But it can't be used to store to values -- legally)*" what please (referring the text in parenthesis)? – alk Aug 01 '15 at 18:56
  • @alk I was referring to the fact in `int a[128]; int *p =&a[0]+128;`, `p` is required to be a valid pointer even though it's outside the memory allocated for `a`. The same is true for `int i; int *p=&i+1;` too. – P.P Aug 01 '15 at 19:04
  • @BlueMoon: I see, but being "*a valid pointer*" which may *not* be dereferenced. So there is definitely *no* need to reserve some spare memory following the array's memory. – alk Aug 01 '15 at 19:07
  • Yes. It doesn't have to be a valid memory location. But it is a possibility where canary may not be present be there. – P.P Aug 01 '15 at 19:11
  • A possible reason the OP's program does not catch `5` characters might be that canaries are placed on well-aligned addresses only. From this we could conclude the OP is using a 16bit implementation. On a recent 64bit system gcc (4.7.2) detects the out-of-bounds access from the 9th character on. – alk Aug 01 '15 at 19:12
  • @BlueMoon: "*But it is a possibility where canary may not be present be there*" I cannot follow, do not understand this conclusion. – alk Aug 01 '15 at 19:14
  • It would be interesting to know on which implementation you are actually observing what your question describes. – alk Aug 01 '15 at 19:16
  • If it is a valid location (for example, another object as in my example or just a reserved space for alignment) then canary value may not be overwritten and thus not raising SIGABRT. – P.P Aug 01 '15 at 19:17
  • I have a gcc 5.1.1, glibc 2.21 on a 32bit system. But then again, this is not predictable (as with all undefined behaviours) as with different optimizations memory layout might change. – P.P Aug 01 '15 at 19:20

The "stack smashing detected" message here actually comes from a protection mechanism used by the compiler to detect buffer overflow errors. The compiler adds protection variables (known as canaries) which have known values.

In your case, an input string longer than 5 characters corrupts this variable, and the program is terminated with SIGABRT.

You can read more about buffer overflow protection. But as @alk answered, you are invoking undefined behavior.

Amol Saindane

Actually, even if we declare an array of size 5, we can still store and access data beyond the end of the array, because the memory beyond it is unused. We can keep doing so as long as that memory stays free, but once it is assigned to another program we can no longer access the data there.