Is the compiler allowed to store a larger value in an 8-bit uninitialized variable?
Here is a small example that illustrates the issue:
#include <stdio.h>
#include <inttypes.h>

int unhex(char *data, char *result, unsigned length)
{
  uint8_t csum; /* in a working code it should be =0 */
  for (unsigned i = 0; i < length; i++) {
    if (!sscanf(data + 2*i, "%2hhx", result + i)) {
      printf("This branch is never taken\n");
      return 597;
    }
    csum += result[i];
  }
  return csum;
}

char a[] = "48454c4c4f";
char b[20];

int main() {
  printf("%d\n", unhex(a,b,5));
  printf("%s\n", b);
}
When the above code is compiled with -Og optimization and run, it prints:
597
HELLO
I would expect the first line to be an 8-bit value, i.e. something in [0, 256).
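For reference, here is a minimal sketch of the initialized variant that the comment in the code alludes to. The name unhex_checked, the unsigned char buffer type, and the != 1 check on sscanf are illustrative choices of mine, not part of the original program. With csum starting at 0, the bytes 0x48 + 0x45 + 0x4c + 0x4c + 0x4f sum to 372, which wraps to 116 in a uint8_t, so that is the kind of value I expected to see.

```c
#include <stdio.h>
#include <inttypes.h>

/* Hypothetical initialized variant of unhex(): returns the 8-bit checksum,
   or 597 if a hex pair cannot be parsed. */
int unhex_checked(const char *data, unsigned char *result, unsigned length)
{
    uint8_t csum = 0;  /* the only essential change: start from a defined value */
    for (unsigned i = 0; i < length; i++) {
        /* sscanf returns the number of converted items (or EOF), so compare with 1 */
        if (sscanf(data + 2 * i, "%2hhx", result + i) != 1)
            return 597;
        csum += result[i];  /* wraps modulo 256 when stored back into the uint8_t */
    }
    return csum;  /* converted to int, so always within [0, 256) */
}
```

Calling unhex_checked("48454c4c4f", buf, 5) with a zeroed buffer should return 116 and leave HELLO in buf.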
My GCC version is gcc (Gentoo 8.3.0-r1 p1.1) 8.3.0, and I compile with
gcc -o stuff -Og -ggdb -Wall -Wextra -Wuninitialized -Wmaybe-uninitialized stuff.c
GCC does not warn about the possibly uninitialized variable. I know that this is UB, but is it really allowed to go this far?
I understand that what goes on here is that GCC optimizes csum out entirely and treats return csum as if it had never been there in the first place.
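To make that reading concrete, here is a sketch of what the optimized function appears to behave like. This is an assumption inferred from the observed output, not something I verified in the generated assembly: since csum never holds a defined value, the compiler would be free to pick any value for it at the final return, including the 597 that the error path already returns. The name unhex_as_if and the tightened parameter types are hypothetical.

```c
#include <stdio.h>

/* Hypothetical "as-if" version: what the -Og build seems to behave like.
   An assumption based on the printed 597, not taken from GCC's output. */
int unhex_as_if(const char *data, unsigned char *result, unsigned length)
{
    for (unsigned i = 0; i < length; i++) {
        if (!sscanf(data + 2 * i, "%2hhx", result + i)) {
            printf("This branch is never taken\n");
            return 597;
        }
        /* csum += result[i] has vanished: csum was never given a value */
    }
    return 597;  /* both exits now share the same return value */
}
```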
So there are two questions, actually:
- Is the compiler unable to warn the programmer in this case (there is clearly no explicit assignment to csum anywhere in the code), or is this just a bug/limitation of GCC?
- Does the C standard permit this kind of UB for uninitialized variables? (Here, de facto, a value outside the type's range is stored.)
EDIT: the funny thing is that if += is replaced by =, the compiler complains, just as it should, near the return line:
stuff.c: In function ‘unhex’:
stuff.c:14:10: warning: ‘csum’ may be used uninitialized in this function [-Wmaybe-uninitialized]
   return csum;
          ^~~~
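For clarity, this is the variant that triggers the warning, written out as a sketch. The name unhex_assign and the tightened parameter types are hypothetical; csum is deliberately left uninitialized to match the original. One path that reaches the return with csum unset is easy to see here: if length is 0, the loop body never runs, which is presumably the path the warning refers to.

```c
#include <stdio.h>
#include <inttypes.h>

/* Hypothetical name; same structure as unhex(), but with = instead of +=. */
int unhex_assign(const char *data, unsigned char *result, unsigned length)
{
    uint8_t csum;  /* deliberately uninitialized, as in the original */
    for (unsigned i = 0; i < length; i++) {
        if (!sscanf(data + 2 * i, "%2hhx", result + i)) {
            printf("This branch is never taken\n");
            return 597;
        }
        csum = result[i];  /* overwritten each iteration, so set whenever length > 0 */
    }
    return csum;  /* still unset if length == 0 */
}
```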