I'm learning C, coming from Python, and I'm trying to understand why printf() behaves this way.
My understanding is that a string in C is an array of characters followed by a null character.
If you declare your string like this:
char string[] = "I am a string";
printf("char: %zu bytes\n", sizeof(char));
printf("%s\n", string);
printf("%zu\n", sizeof(string));
The compiler will insert the null character at the end implicitly. Output:
char: 1 bytes
I am a string
14
Notice that a char is 1 byte in size and there are 13 characters in the string (10 letters and 3 spaces), yet the size of the array is 14 bytes. The extra byte is the null character.
Or, we declare our string like this:
char string2[] = {'h', 'e', 'l', 'l', 'o', '\0'};
printf("%s\n", string2);
printf("%zu\n", sizeof(string2));
We manually inserted the null character into our array, and it behaves as expected when we print it or get its size:
hello
6
However, if we don't manually insert the null character, and try to print it:
char string3[] = {'w', 'o', 'r', 'l', 'd'};
printf("%s\n", string3);
printf("%zu\n", sizeof(string3));
we get (or at least, on my machine, I get):
worldhello
5
It prints string2 at the end. We can dig a little deeper to see why:
char *strAddr = string2;
char *str3Addr = string3;
char *lastElemStr3 = &string3[4];
printf("Address of hello string:\t\t\t%p\n", (void *)strAddr);
printf("Address of world string:\t\t\t%p\n", (void *)str3Addr);
printf("Address of string 3 last element + one byte:\t\t%p\n", (void *)(lastElemStr3 + 1));
which gives
Address of hello string: 0x7ffee10768a4
Address of world string: 0x7ffee107689f
Address of string 3 last element + one byte: 0x7ffee10768a4
So printf is "overstepping" the end of our array and printing whatever is in the memory that follows it; in this case, that is where the previous array happened to be stored. Why does it do that only when there's no null character at the end? What benefit does this have?