I'm reading C++ Primer Plus by Stephen Prata. He gives this example:
char dog[8] = { 'b', 'e', 'a', 'u', 'x', ' ', 'I', 'I'}; // not a string!
char cat[8] = {'f', 'a', 't', 'e', 's', 's', 'a', '\0'}; // a string!
with the comment that:
Both of these arrays are arrays of char, but only the second is a string. The null character plays a fundamental role in C-style strings. For example, C++ has many functions that handle strings, including those used by cout. They all work by processing a string character-by-character until they reach the null character. If you ask cout to display a nice string like cat in the preceding example, it displays the first seven characters, detects the null character, and stops. But if you are ungracious enough to tell cout to display the dog array from the preceding example, which is not a string, cout prints the eight letters in the array and then keeps marching through memory byte-by-byte, interpreting each byte as a character to print, until it reaches a null character. Because null characters, which really are bytes set to zero, tend to be common in memory, the damage is usually contained quickly; nonetheless, you should not treat nonstring character arrays as strings.
Now, if I declare my variables as globals, like this:
#include <iostream>
using namespace std;
char a[8] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'};
char b[8] = {'1', '2', '3', '4', '5', '6', '7', '8'};
int main(void)
{
    cout << a << endl;
    cout << b << endl;
    return 0;
}
the output will be:
abcdefgh12345678
12345678
So cout does indeed keep "marching through memory byte-by-byte", but only to the end of the second character array. The same thing happens with any combination of char arrays. I suspect that all the addresses after them are initialized to 0 and that this is why cout stops. Is this true? If I do something like:
for (int i = 0; i < 100; ++i)
{
    cout << *(&a + i) << endl;
}
I get mostly empty output (perhaps 95% of the lines), but not everywhere.
If, however, I declare my char arrays a little shorter, like:
char a[3] = {'a', 'b', 'c'};
char b[3] = {'1', '2', '3'};
keeping all other things the same, I'm getting the following output:
abc
123
Now cout doesn't even get past the first char array, let alone the second. Why is this happening? I've checked the memory addresses and they are sequential, just like in the first scenario. For example,
cout << &a << endl;
cout << &b << endl;
gives
003B903C
003B9040
Why is the behavior different in this case? Why doesn't it read beyond the first char array?
Lastly, if I declare my variables inside main, I do get the behavior Prata describes: a lot of junk is printed before a null character is finally reached.
I'm guessing that in the first case the char arrays are allocated on the heap, that this memory is initialized to 0 (but not everywhere; why?), and that cout behaves differently depending on the length of the char array (why?).
I'm using Visual Studio 2010 for these examples.