Note: I know that reading an uninitialized string is undefined behaviour. This question is specifically about the GCC implementation.
I am using GCC version 6.2.1 and I have observed that uninitialized strings of length greater than 100 or so are initialized to "". Reading an uninitialized string is undefined behaviour, so the compiler is free to set it to "" if it wants to, and it seems that GCC is doing this when the string is long enough. Of course I would never rely on this behaviour in production code - I am just curious about where this behaviour comes from in GCC. If it's not in the GCC code somewhere then it's a very strange coincidence that it keeps happening.
If I write the following program
/* string_initialization.c */
#include <stdio.h>

int main()
{
    char short_string[10];
    char long_string[100];
    char long_long_string[1000];

    printf("%s\n", short_string);
    printf("%s\n", long_string);
    printf("%s\n", long_long_string);
    return(0);
}
and compile and run it with GCC, I get:
$ ./string_initialization
�QE�
$
(sometimes the first string is empty as well). This suggests that if a string is long enough, then GCC will initialize it to "", but otherwise it will not always do so.
If I compile the following program with GCC and run it:
#include <stdio.h>

int main()
{
    char long_string[100];
    int i;

    for (i = 0 ; i < 100 ; ++i)
    {
        printf("%d ", long_string[i]);
    }
    printf("\n");
    return(0);
}
then I get
0 0 0 0 0 0 0 0 -1 -75 -16 0 0 0 0 0 -62 0 0 0 0 0 0 0 15 84 -42 -17 -4 127 0 0 14 84 -42 -17 -4 127 0 0 69 109 79 -50 46 127 0 0 1 0 0 0 0 0 0 0 -35 5 64 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -112 5 64 0 0 0 0 0 80 4 64 0 0 0 0 0 16 85 -42 -17
so just the start of the string is being initialized to 0, not the whole thing.
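To make "just the start" concrete, here is a minimal sketch that counts how many leading bytes of a buffer are zero. The helper names `leading_zero_bytes` and `posted_sample_leading_zeros` are made up for illustration, and the sample array simply copies the first eleven values from the output above rather than reading any uninitialized memory.

```c
#include <stddef.h>

/* Hypothetical helper: number of consecutive zero bytes at the start of buf. */
size_t leading_zero_bytes(const char *buf, size_t n)
{
    size_t i = 0;
    while (i < n && buf[i] == 0) {
        ++i;
    }
    return i;
}

/* Applies the helper to the first eleven values copied from the posted output. */
size_t posted_sample_leading_zeros(void)
{
    const char sample[] = { 0, 0, 0, 0, 0, 0, 0, 0, -1, -75, -16 };
    return leading_zero_bytes(sample, sizeof sample);
}
```

Since the very first byte is 0, `printf("%s\n", ...)` stops immediately and prints an empty line, even though later bytes in the array are nonzero.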
I'd like to look into the GCC source code to see what the policy is, but I don't know that code base well enough to know where to look.
Background: My CS student turned in some work in which they declared a string to have length 1000 because "otherwise strange symbols are printed". You can probably guess why. I want to be able to give them a good answer as to why this was going on and why their "fix" worked.
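For contrast, this is roughly what I would expect the well-defined version to look like: each array is given an explicit initial value, so the result no longer depends on the array length or on whatever happens to be on the stack. It is only a sketch; the function name `initialized_lengths_sum` is made up for illustration, and the three forms of initialization are just the ones I would show a student.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: total length of three explicitly initialized strings. */
size_t initialized_lengths_sum(void)
{
    char short_string[10] = "";       /* first byte is '\0', the rest are zero-filled */
    char long_string[100] = {0};      /* every element is zero */
    char long_long_string[1000];

    long_long_string[0] = '\0';       /* only the first byte is set; enough for %s */

    /* All three are valid empty strings, so each strlen() is 0. */
    return strlen(short_string) + strlen(long_string) + strlen(long_long_string);
}
```

With any of these, printing the array with `%s` gives an empty line at length 10 just as reliably as at length 1000.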
Update: Thanks to those of you who gave useful answers. I've just found out that my computer prints out an empty string if the string is of length 1000, but garbage if the string is of length 960. See pts's answer for a good explanation. Of course, all this is completely system-dependent and is not part of GCC.