I got stuck on a problem while doing my homework.

Computers store multi-byte data in memory in one of two byte orders: big-endian or little-endian. To detect which byte order a machine uses, a student wrote the following program:

union NUM {
    int a;
    char b;
} num;

int main(){
    num.b = 0xff;
    if (num.a != 0xff)
        printf("bigend");
    else
        printf("smallend");
    return 0;
}

But he found that when the program runs on an x86 machine, it actually prints 'bigend', which is clearly wrong. Do you know where the problem is? How should this program be modified?

I asked my teacher, who said the question is correct as written. I found some information on a few websites, but it only left me more confused. Why is the question not wrong? And where is the problem, actually?

张敏学

2 Answers

Assuming an int (and therefore the union as a whole) is 4 bytes, writing to num.b only writes to one of those 4 bytes, leaving the rest uninitialized. Subsequently reading num.a reads those uninitialized bytes, invoking undefined behavior.

The bytes in the union must be set to all 0s so that the content is well defined.

#include <stdio.h>
#include <string.h>

union NUM {
    int a;
    char b;
} num;

int main(){
    // set all bytes of num to 0 first
    memset(&num, 0, sizeof(num));
    num.b = 0xff;
    if (num.a != 0xff)
        printf("bigend");
    else
        printf("smallend");
    return 0;
}
dbush

Type-punning is undefined behavior in C. When you read num.a you are violating the strict aliasing rules, so the compiler is allowed to generate code that may give back anything. Which it does.

To avoid this, you need to use memcpy():

int a = 0x00010203;
char bytes[sizeof(a)];
memcpy(bytes, &a, sizeof(a));
if(bytes[0] == 03) {
    printf("small endian\n");
} else ...
cmaster - reinstate monica