Overflow in C function strcpy()

Question

I'm programming in C in a Linux environment and I'm confused about why a segmentation fault does not occur in this code:

int main(){
 char buffer[4];
 char tmp="qqqqqqqqqqqqqqqqqqqqqqqq";
 char *r;
 r=strcpy(buffer,tmp);
 return 0;}

The variable tmp is longer than buffer, and despite that I can print buffer correctly without any error.

Furthermore, I don't understand why in this case:

int main(){
 static char buffer[4];
 int i=0;
 while(i<5){
    (*(buffer+i)='a');
      i++;}
 return 0;}

the segmentation fault occurs only if I do not declare buffer static.

Thank you in advance.

UB doesn't mean you'll get a SegFault, see [this answer](http://stackoverflow.com/a/14382131/3426025) — BeyelerStudios, Oct 14 '16 at 09:51
`char tmp="qqqqqqqqqqqqqqqqqqqqqqqq";` is wrong: you're assigning a `const char *` to `char`, don't ignore compiler warnings — Elias Van Ootegem, Oct 14 '16 at 09:52
`while(i<5){` is also wrong, `buffer` is of type `char[4]`, which means valid indexes are 0 through 3. — Elias Van Ootegem, Oct 14 '16 at 09:53
There's no guarantee that you'll automatically generate a segfault if you overrun buffer; that's why buffer overrun attacks work :-(. The answer varies on platform a bit, but basically the type of variable (static or no, function-local, malloc()ed) changes where it gets stuffed in memory, which may be looser or tighter depending on hardware assist (e.g. NC), stack-smashing protection, and other factors. In short: don't count on memory protection to save you from all forms of stupid :-/ — BJ Black, Oct 14 '16 at 09:58
Does the first example even compile? Calling `strcpy(buffer, tmp)` should produce a compiler error since `tmp` is declared as a `char` and not a `char*`. — nonsensickle, Oct 14 '16 at 10:03

score 8 · Answer 1 · answered Oct 14 '16 at 10:06

In the first case, buffer is large enough to hold 4 chars, which generally means it can hold 3 characters + 1 nul char. strcpy gives you no way to guard against overflows, whereas strncpy lets you cap the number of characters copied. It's a simple matter of writing:

const char *tmp = "your string"; // const char *, not char
char buffer[4];
strncpy(buffer, tmp, (sizeof buffer) - 1); // sizeof char array == number of characters buffer can store
buffer[3] = '\0';//add terminating nul char

In the second case, the biggest issue is that your while loop is accessing an index that is out of bounds (i<5 means the last iteration will have i == 4). Arrays are zero-indexed, so the last valid index for buffer is 3. Change the loop to:

while(i<3) {
    buffer[i++] = 'a';
}
buffer[i] = '\0';

You can do away with the explicit nul char by zero-initializing buffer, so that its last element is already '\0':

char buffer[4] = "";

So I'd probably write something like this:

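#include <string.h> /* strncpy */

int main ( void )
{
    const char *tmp = "some long string";
    char buffer[4] = ""; /* every element starts out as '\0' */
    strncpy(buffer, tmp, (sizeof buffer) - 1); /* never touches buffer[3] */
    return 0;
}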
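
The same idea can be wrapped in a small helper if you also want to know whether the source was cut off. This is only a sketch under that assumption: the name copy_truncated and its signature are hypothetical, not part of the standard library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a standard function): copies src into dst,
 * always nul-terminates, and returns true if src did not fit. */
static bool copy_truncated(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t src_len = strlen(src);
    /* keep one element free for the terminating nul char */
    size_t n = src_len < dst_size ? src_len : dst_size - 1;

    memcpy(dst, src, n);
    dst[n] = '\0';
    return src_len >= dst_size;
}
```

Called as copy_truncated(buffer, sizeof buffer, tmp), it leaves at most 3 characters plus the nul char in a char buffer[4], and the return value tells you whether anything was dropped.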

score 3 · Answer 2 · answered Oct 14 '16 at 09:53

C gives you the ability to shoot yourself in the foot.

It's your responsibility to ensure that the receiving buffer is large enough for the contents of the source string passed to strcpy. (Don't forget to allow space for the nul terminator.) That's why people who enjoy their current employment will use strncpy, which allows you to put an upper bound on the number of characters to be copied.

Currently the behaviour of your program is undefined. A segmentation fault is one possible manifestation.
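
A minimal sketch of what that upper bound does and does not do: when the source is at least as long as the limit, strncpy stops at the limit and writes no terminating nul, so you still have to add one yourself. The function name strncpy_leaves_unterminated is made up for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Illustrative only: returns true if strncpy left a 4-char buffer
 * without any nul char after copying src into it. */
static bool strncpy_leaves_unterminated(const char *src)
{
    assert(src != NULL);

    char buffer[4];
    memset(buffer, 'X', sizeof buffer);  /* start with no nul anywhere */
    strncpy(buffer, src, sizeof buffer); /* writes exactly 4 bytes, never more */

    /* memchr only looks at the 4 bytes of buffer, so the check is safe */
    return memchr(buffer, '\0', sizeof buffer) == NULL;
}
```

With a long source such as "abcdef" the buffer ends up holding 'a', 'b', 'c', 'd' and no terminator; with a short one such as "ab" strncpy pads the rest with nul chars.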

score 1 · Answer 3 · edited May 23 '17 at 12:16

Since your first example shouldn't even compile I will assume that it is a simple typo in the question and work from the following code instead of your first example:

int main(){
 char buffer[4];
 char *tmp="qqqqqqqqqqqqqqqqqqqqqqqq"; // this should be a pointer
 char *r;
 r=strcpy(buffer,tmp);
 return 0;}

Everyone's comments about undefined behavior are correct, in that it doesn't have to result in a seg. fault, but I think that these comments are missing the point of your question. So I will skip past the obvious reasons you shouldn't write code like this and focus on why it seems to work (for lack of a better word) when you use the static keyword.

The static keyword semantically changes the lifetime of your buffer, but lower down it also changes where in memory your buffer is stored.

You've probably heard of the heap and the stack, and if you haven't you should read up on them, as they are crucial concepts for programming in C. But there are more memory regions than just those two in a C program: the stack and heap are used for dynamic memory, while static memory is stored in the data and bss segments of your program.

Overflowing a buffer in different memory regions has different effects on the behavior of your program which are entirely dependent on what is stored around your buffer's memory location.
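
To see those regions for yourself, you can print where differently declared buffers end up. This is a rough sketch: the names are invented for the example, the segment noted in each comment is what a typical Linux toolchain does rather than anything the C standard promises, and the addresses will differ from run to run and from machine to machine.

```c
#include <stdio.h>
#include <stdlib.h>

static char bss_buffer[4];          /* static, zero-initialized: typically .bss */
static char data_buffer[4] = "abc"; /* static, initialized: typically .data */

/* Prints the address of one buffer per kind of storage.
 * Returns 0 on success, -1 if malloc fails. */
static int print_buffer_addresses(void)
{
    char stack_buffer[4];           /* automatic storage: typically the stack */
    char *heap_buffer = malloc(4);  /* allocated storage: the heap */
    if (heap_buffer == NULL)
        return -1;

    printf("stack : %p\n", (void *)stack_buffer);
    printf("heap  : %p\n", (void *)heap_buffer);
    printf(".bss  : %p\n", (void *)bss_buffer);
    printf(".data : %p\n", (void *)data_buffer);

    free(heap_buffer);
    return 0;
}
```

On a typical system the static buffers sit close to each other and far from the stack address, which is why overrunning them tramples different neighbours.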

Without the static keyword, as in your first example, the buffer is placed on the stack and is surrounded by your function's local variables as well as other information, such as the point in your code to execute after the function returns. It is important to understand the stack frame, a.k.a. the call stack. Since I suspect your buffer overflow investigations are inspired by buffer overflow attacks, I would recommend reading this explanation about how they work.

When you overrun the buffer on the stack one possible result is that your program tries to continue from the wrong point, which may not even be executable code. If your program tries to execute things that are not code then the operating system steps in and dumps you like a cheating husband/wife. But note that this is only one possibility.

However, by making the buffer static you are taking it out of the stack and putting it somewhere else that is pretty far away from executable code and is most likely surrounded by other data. When you overflow this buffer you are corrupting data and not code, so now the behavior of your program depends entirely on what data has been corrupted and whether your program will do something bonkers because of it. It might just behave weirdly without crashing, or it might crash straight away, depending on what was changed.
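
The "corrupting neighbouring data" effect can be modelled without invoking undefined behavior by putting the buffer and a neighbour inside one struct and writing through the struct's bytes. This is only a model, with hypothetical names: in a real overflow you do not get to choose what the neighbour is, and the sketch assumes the compiler puts no padding between the two char arrays, which holds on common platforms.

```c
#include <stddef.h>
#include <string.h>

/* A stand-in for "a buffer and whatever happens to sit next to it". */
struct region {
    char buffer[4];
    char neighbour[4];
};

/* Writes len bytes of 'a' starting at buffer, clamped so the write never
 * leaves the struct, and returns the first byte of neighbour afterwards. */
static char overwrite_and_inspect(size_t len)
{
    struct region r = { "", "zzz" };
    unsigned char *bytes = (unsigned char *)&r;
    size_t start = offsetof(struct region, buffer);

    if (len > sizeof r - start)
        len = sizeof r - start;  /* stay inside the struct */

    memset(bytes + start, 'a', len);
    return r.neighbour[0];
}
```

Writing 4 bytes leaves neighbour untouched, while writing 5 silently changes its first character. Nothing crashes, the data is simply wrong, which is the quiet failure mode described above.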

Undefined behavior is only undefined from the point of view of the standard. When you use undefined behavior in C, the behavior you get depends on your compiler and your machine, which can do whatever they like. Computers are deterministic by nature and everything they do is defined by the code that they run, so it is possible to define undefined behavior if you dig deep enough. But it's a lot safer to avoid it, and doing so will make your code work everywhere instead of just on your machine with that obscure/obsolete compiler version...
