I've seen several uses of fgets (for example, here) that go like this:
char buff[7]="";
(...)
fgets(buff, sizeof(buff), stdin);
The point being that, if I supply a long input like "aaaaaaaaaaa", fgets will truncate it to "aaaaaa" here, because the 7th character will be used to store '\0'.
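To make the truncation concrete, here is a minimal sketch. It assumes a hypothetical helper, read_into_7, and a BUFF_SIZE macro (neither is from the original snippet), and it feeds fgets from a temporary file instead of stdin so the behaviour can be checked without typing anything.

```c
#include <stdio.h>
#include <string.h>

#define BUFF_SIZE 7  /* same size as buff in the snippet above */

/* Hypothetical helper: writes `input` to a temporary file, rewinds it,
   and reads it back with fgets into a BUFF_SIZE-byte buffer.
   Returns the number of characters stored before the '\0',
   or -1 if the temporary file or the read fails. */
int read_into_7(const char *input, char out[BUFF_SIZE])
{
    FILE *fp = tmpfile();
    if (fp == NULL)
        return -1;

    if (fputs(input, fp) == EOF) {
        fclose(fp);
        return -1;
    }
    rewind(fp);

    /* fgets stores at most BUFF_SIZE - 1 characters, then a '\0'. */
    if (fgets(out, BUFF_SIZE, fp) == NULL) {
        fclose(fp);
        return -1;
    }

    fclose(fp);
    return (int)strlen(out);
}
```

Calling read_into_7("aaaaaaaaaaa\n", buff) on a char buff[7] should store six 'a' characters followed by '\0' in buff[6], so the terminator sits inside the array; a short line such as "abc\n" is stored whole, newline included.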
However, when doing this:
int i=0;
for (i=0;i<7;i++)
{
buff[i]='a';
}
printf("%s\n",buff);
I will always get 7 'a's, and the program will not crash. But if I try to write 8 'a's, it will.
As I found out later, the reason for this is that, at least on my system, when I declare char buff[7] (with or without =""), the 8th byte (counting from 1, not from 0) gets set to 0. My guess is that things are done this way precisely so that a for loop with 7 writes, followed by a formatted string read, can succeed whether or not the last character written was '\0', sparing the programmer from having to set the final '\0' themselves when writing chars individually.
From this, it follows that in the case of
fgets(buff, sizeof(buff), stdin);
and then providing an input that is too long, the resulting buff string will automatically have two '\0' characters: one inside the array, and one right after it that was written by the system.
I have also observed that doing
fgets(buff,(sizeof(buff)+17),stdin);
will still work, and output a very long string, without crashing. My guess is that this is because fgets will keep writing up to sizeof(buff)+17 bytes, and the last char written will be precisely a '\0', ensuring that any subsequent string read terminates properly (although the memory is messed up anyway).
But then, what about fgets(buff, (sizeof(buff)+1),stdin);? This would use up all the space that was rightfully allocated in buff, and then write a '\0' right after it, thus overwriting... the '\0' previously written by the system. In other words, yes, fgets would go out of bounds, but it can be proven that when adding only one to the length of the write, the program will never crash.
So in the end, here comes the question: why does fgets always terminate its write with a '\0' when another '\0', placed by the system right after the array, already exists? Why not do as in the one-by-one for-loop based write, which can access the whole array and write anything the programmer wants, without endangering anything?
Thank you very much for your answer!
EDIT: indeed, no proof is possible as long as I do not know whether this 8th '\0', which mysteriously appears upon allocation of buff[7], is part of the C standard or not, specifically for string arrays. If not, then... it's just luck that it works :-)
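For what it's worth, the C standard only gives the program the 7 elements of buff; it says nothing about a zero byte following the array, so the doubt in the EDIT seems justified. Below is a minimal sketch of two ways to write the chars one by one without depending on that extra byte. The helper names and the BUFF_SIZE macro are hypothetical, and the text is formatted with snprintf into a caller-supplied buffer (rather than printed) so the result is easy to check.

```c
#include <stdio.h>
#include <string.h>

#define BUFF_SIZE 7  /* same size as buff in the question */

/* Hypothetical helper: fills all BUFF_SIZE bytes with 'a' (so there is
   no terminator inside the array) and formats them with an explicit
   precision, so the formatting code never reads past buff[6].
   Returns the number of characters the formatted text contains. */
int format_full_buffer(char *out, size_t out_size)
{
    char buff[BUFF_SIZE];
    for (int i = 0; i < BUFF_SIZE; i++)
        buff[i] = 'a';

    /* "%.*s" with precision BUFF_SIZE reads at most BUFF_SIZE bytes. */
    return snprintf(out, out_size, "%.*s", BUFF_SIZE, buff);
}

/* Hypothetical alternative: keep the last element for the terminator,
   which is the same layout fgets produces. Plain "%s" is then safe. */
int format_terminated_buffer(char *out, size_t out_size)
{
    char buff[BUFF_SIZE];
    for (int i = 0; i < BUFF_SIZE - 1; i++)
        buff[i] = 'a';
    buff[BUFF_SIZE - 1] = '\0';

    return snprintf(out, out_size, "%s", buff);
}
```

With a 16-byte output buffer, the first helper should produce seven 'a' characters and the second six. The same "%.*s" format works with printf when the goal is simply to print an array that holds no terminator.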