Self-teaching C is full of surprises. I wrote this short snippet to test strcat(), which supposedly appends the second parameter to the first one:

#include <stdio.h>
#include <string.h>

char s1[4] = "Foo ";
char s2[] = "Bar";

int main(void) {

    strcat(s1, s2);

    printf("%s %d %d \n", s1, strlen(s1), strlen(s2));
    return 0; 
}

I was expecting some overflow error since s1 is an array of 4 chars, but instead I got this:

Foo BarBar BarBar 10 6

I did this on Windows using MS Visual Studio Express 2013 (which, by the way, raises some warnings about using strcat). So... why did strcat duplicate the value of s2? That's not in the documentation.

Pascal Cuoq
  • An overflow can cause many different behaviours. The language doesn't specify what happens after an overflow. It's undefined behaviour. Anything can happen! – jweyrich Aug 29 '14 at 13:46
  • You have to understand that C's model is as simple as possible. You first have to adapt to a world without exceptions. – Jason Hu Aug 29 '14 at 13:48
  • The standard specifies that writing past the bounds of an array is `undefined behaviour`. Printing the result of strlen with `%d` is also `undefined behaviour`. – mch Aug 29 '14 at 13:48
  • BTW: `printf("%s %d %d \n", s1, strlen(s1), strlen(s2));` should be `printf("%s %zu %zu\n", s1, strlen(s1), strlen(s2));` – chux - Reinstate Monica Aug 29 '14 at 14:13

3 Answers

This

char s1[4] = "Foo ";

creates a character sequence s1 without a zero-terminator. That means that s1 is not a string and it is illegal to pass it to strcat. The behavior is undefined.

(In the above declaration with initialization you are using an obscure feature of C language, which allows zero-terminator to "fall off" the end of the char array being initialized. In C++ this initialization would be ill-formed, since the initializer string requires a buffer of size 5, not 4.)

In practice this causes strcat to run over the end of s1 array and into s2 (apparently accidentally located nearby in memory) looking for zero-terminator in the first argument. So in the end you are adding s2 to a combination of s1+s2 in memory, which creates the effect of s2 getting duplicated. Needless to say, the result is meaningless.

Of course, even with s1 of size 5 the code would still exhibit undefined behavior, since there's no room in the target buffer for the result of concatenation. The size of s1 must be at least 8 for the result to fit into it: 4 characters of "Foo ", 3 of "Bar", and 1 for the zero-terminator.

AnT stands with Russia

First: s1 is not null-terminated.

Second: you're allocating static memory for s1 and s2. The compiler usually packs those into a shared data block of your executable. If you read past the boundary of s1, you are still reading inside this block. Since your program 'owns' this memory, the OS won't complain about it.

BeyelerStudios

The first thing to note is that a C string is "null terminated": it ends at the first null byte ('\0'), and hence cannot contain embedded null bytes. So the string "Foo " actually needs five chars of storage: {'F', 'o', 'o', ' ', '\0'}.

You define a 4-character array and initialize it with that five-element string: char s1[4] = "Foo "; The null byte does not fit and is dropped. The next array, s2, which is likely placed right after the first one, therefore begins exactly where the null byte of s1 would have been.

Because C strings are defined up to the null byte, strcat scans the first string until it reaches the null byte. But because s1 has no null byte of its own, the first null encountered by strcat is the one from s2. So from the point of view of strcat, s1 looks like "Foo Bar\0".

xen-0