I am trying to understand buffer overflows. This is my code:

#include <stdio.h>

int main() 
{
    char buf[5] = { 0 };
    char x = 'u';

    printf("Please enter your name: ");
    gets(buf);

    printf("Hello %s!", buf);

    return 0;
}

The buf array has size five and is initialized with zeros, so (with null termination) I have space for four characters. If I enter five characters ("stack", for example), I overwrite the null terminator, and printf should print "Hello stacku!" because the variable x follows buf. But this isn't the case: it simply prints "Hello stack!". Could someone please explain why?

schorsch312
lukasl1991
  • Just because you've declared the variables directly after each other, there's no guarantee that they end up that way in memory. Especially since `buf` will be a pointer somewhere to a memory address, which in this case is probably followed by zeroed memory. – fredrik Aug 24 '18 at 11:53
  • btw, the code you've provided doesn't even attempt to overwrite the null terminator. – fredrik Aug 24 '18 at 11:54
  • Generally, you should avoid `gets`: https://stackoverflow.com/questions/1694036/why-is-the-gets-function-so-dangerous-that-it-should-not-be-used – Bob__ Aug 24 '18 at 12:14
  • Buffer overflows are easy - don't do it. – Martin James Aug 24 '18 at 12:15
  • @fredrik `buf` is no pointer. – Gerhardh Aug 24 '18 at 12:21
  • @Gerhardh it is, or what they're doing in [here](https://en.wikibooks.org/wiki/C_Programming/Pointers_and_arrays) would not work. the `char* pBuf = buf` construct specifically, or rather the assignment would work - but the use of pBuf would be undefined behaviour. – fredrik Aug 24 '18 at 12:34
  • @fredrik How is that related to this question? Here we only have `char buf[5] = { 0 };` without any pointer. And therefore `buf` will be an array which takes space on the stack and not somewhere else where a pointer might point to. – Gerhardh Aug 24 '18 at 12:39
  • @fredrik Why don't you see an attempt to overwrite the null terminator? It only depends on the input length. The behaviour was different from my expectation because the stack grows downwards, not upwards, and `gets` writes the input plus a terminating null. – lukasl1991 Aug 24 '18 at 14:45
  • @Gerhardh @fredrik Is `buf` a pointer or not when I write `char buf[5] = { 0 };` ? – lukasl1991 Aug 24 '18 at 14:48
  • Pointers have `*` in the definition, while arrays use `[]`. No `*`? No pointer! End of story. This is an array that consumes memory for the full range of elements not just for one address. – Gerhardh Aug 24 '18 at 15:26
  • Things might get a bit more complicated when it comes to function parameters. But here we have a definition of a variable where the rules above apply. – Gerhardh Aug 24 '18 at 15:28

3 Answers

The short explanation is, just because you declared 'x' on the source line after 'buf', that doesn't mean the compiler put them next to each other on the stack. With the code shown, 'x' isn't used at all, so it probably didn't get put anywhere. Even if you did use 'x' somehow (and it would have to be a way that prevents it being stuffed into a register), there's a good chance the compiler will sort it below 'buf' precisely so that it does not get overwritten by code overflowing 'buf'.

You can force this program to overwrite 'x' with a struct construct, e.g.

#include <stdio.h>

int main() 
{
    struct {
        char buf[5];
        char x[2];
    } S = { { 0 }, { 'u' } };

    printf("Please enter your name: ");
    gets(S.buf);

    printf("Hello %s!\n", S.buf);
    printf("S.x[0] = %02x\n", S.x[0]);

    return 0;
}

because the fields of a struct are always laid out in memory in the order they appear in the source code.[1] In principle there could be padding between S.buf and S.x, but char must have an alignment requirement of 1, so the ABI probably doesn't require that.

But even if you do that, it won't print 'Hello stacku!', because gets always writes a terminating NUL. Watch:

$ ./a.out 
Please enter your name: stac
Hello stac!
S.x[0] = 75

$ ./a.out 
Please enter your name: stack
Hello stack!
S.x[0] = 00

$ ./a.out 
Please enter your name: stacks
Hello stacks!
S.x[0] = 73

See how it always prints the thing you typed, but x[0] does get overwritten, first with a NUL, and then with an 's'?

(Have you already read Smashing the Stack for Fun and Profit? You should.)


[1] Footnote for pedants: if bit-fields are involved, the order of fields in memory becomes partially implementation-defined. But that's not important for purposes of this question.

zwol
  • Thanks for your detailed answer! I really thought that the memory layout will exactly be as I declared my variables... Why did you change x to an array? It also works when x is a simple variable!?! – lukasl1991 Aug 24 '18 at 13:53
  • I changed `x` to an array so I could have a second char there which would always remain NUL, as an extra backstop against `printf` printing garbage. Not strictly necessary since `gets` always writes a NUL, but in other buffer-overflow scenarios you don't have that. – zwol Aug 24 '18 at 14:00
  • And additionally, thank you for pointing out that `gets` writes a terminating NUL! I am generally quite confused about when I do and don't have to pay attention to this NUL. As explained [here](https://stackoverflow.com/a/11229516/6382426), `char x[]="asdf";` is the only way to automatically get the terminating NUL, correct? What if I do something like `char *foo = "bar";` or `char c[5] = { 'H', 'E', 'L', 'L', 'O' }`? – lukasl1991 Aug 24 '18 at 14:05
  • That's a completely different issue which I am sure is explained in detail elsewhere on this site, but the short version is you get a terminating NUL if you use a string literal _and_ you don't chop it off with the array length. `char *foo = "bar"` and `char foo[] = "bar"` and `char foo[4] = "bar"` all have a NUL, but `char foo[3] = "bar"` doesn't. `char c[5] = { 'H', 'E', 'L', 'L', 'O' }` doesn't have a terminating NUL and `char c[6] = { 'H', 'E', 'L', 'L', 'O' }` does, but in that case it's technically because of default initialization of array elements to 0. – zwol Aug 24 '18 at 14:34
  • Could you provide a reference for the longer version? I tried to find one but without success. – lukasl1991 Aug 24 '18 at 14:41

As the other answer pointed out, it's not at all guaranteed that x will sit immediately after buf in memory. But even if it did: gets is going to overwrite it. Remember: gets has no way of knowing how big the destination buffer is. (That's its fatal flaw.) It always writes the entire string it reads, plus the terminating \0. So if x happens to sit immediately after buf, then if you type a five-character string, printf is likely to print it correctly (as you saw), and if you were to inspect x's value afterwards:

printf("x = %d = %c\n", x, x);

it would likely show you that x was 0 now, not 'u'.

Here's how the memory might look initially:

     +---+---+---+---+---+
buf: |   |   |   |   |   |
     +---+---+---+---+---+

     +---+
  x: | u |
     +---+

So after you type "stack", it looks like this:

     +---+---+---+---+---+
buf: | s | t | a | c | k |
     +---+---+---+---+---+

     +---+
  x: |\0 |
     +---+

And if you were to type "elephant" it would look like this:

     +---+---+---+---+---+
buf: | e | l | e | p | h |
     +---+---+---+---+---+

     +---+
  x: | a | n   t  \0
     +---+

Needless to say, those three characters n, t, and \0 are likely to cause even more problems.

This is why people say not to use gets, ever. It cannot be used safely.

Steve Summit

Local variables are generally created on the stack. In most implementations, stacks grow downward, not upward, as memory is allocated. So, it is likely that buf is at a higher address than x. That's why, when buf overflows, it does not overwrite x.

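You might be able to confirm this by writing buf[-1]='v';printf("%c\n",x); although that is itself undefined behaviour and might be affected by padding. It may also be instructive to compare the addresses with printf("%p %p\n", (void *)buf, (void *)&x); if the first value is numerically larger, then buf is at a higher address than x. (Subtracting the two pointers directly is not reliable, because pointer subtraction is only defined within a single array.)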

But this is all highly implementation dependent, and can change based on various compiler options. As others have said, you shouldn't rely on any of this.

Tim Randall
  • Yes, you are right! I can reproduce exactly this behaviour and overwrite x with a value I want. Thank you! – lukasl1991 Aug 24 '18 at 14:27
  • But nevertheless, declaring more char arrays in different ways yields different behaviour. Printing the addresses is a good way to observe this. – lukasl1991 Aug 24 '18 at 15:08