
In the example below:

int main(int argc, char *argv[])
{
    int16_t array1[] = {0xffff,0xffff,0xffff,0xffff};
    char array2[] = {0xff,0xff,0xff,0xff};
    printf("Char size: %d \nint16_t size: %d \n", sizeof(char), sizeof(int16_t));

    if (*array1 == *array2)
        printf("They are the same \n");
    if (array1[0] == array2[0])
        printf("They are the same \n");

    printf("%x \n", array1[0]);
    printf("%x \n", *array1);

    printf("%x \n", array2[0]);
    printf("%x \n", *array2);
}

Output:

Char size: 1 
int16_t size: 2 
They are the same 
They are the same 
ffffffff 
ffffffff 
ffffffff 
ffffffff

Why are 32-bit values printed for both char and int16_t, and why can the two be compared and considered the same?

Ben Voigt
TheMeaningfulEngineer
  • The operands are being promoted to `int`s. Sign-extension of `0xff` and `0xffff` to 32 bits both yield `0xffffffff`. See sections _6.3.1.8 Usual arithmetic conversions_ and _6.3.1.1 Boolean, characters, and integers_ in [the C standard](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf). – Michael Jul 22 '15 at 12:05
  • Also, your assignments of things like `0xffff` to `int16_t`, and `0xff` to `char` if `char` is signed on your platform (it usually is), are implementation-defined. Regarding comparisons being the same, after the implementation-defined assignments are finished, you're left with *values* that are indeed the same (in your case, -1 all around). – WhozCraig Jul 22 '15 at 12:09
  • `sizeof` returns a `size_t`, not an `int`. So use the correct format specifier `%zu`. And `sizeof(char)` is _defined_ as `1` by the standard. So you will never get anything else. – too honest for this site Jul 22 '15 at 12:18
  • @Michael: `char` is only sign-extended if it is signed actually. That is implementation defined (but apparently true for OP's implementation). – too honest for this site Jul 22 '15 at 12:34
  • integral promotion is performed **because the compiler doesn't know the formal argument type** (here because it is varargs, also happens when a prototype is lacking). – Ben Voigt Jul 22 '15 at 22:24

3 Answers

They're the same because they're all different representations of -1.

They print as 32 bits' worth of ff because int is 32 bits on your machine, you used %x, and the default argument promotions took place (basically, everything smaller gets promoted to int). Try using %hx. (That'll probably get you ffff; I don't know of a way to get ff here other than by using unsigned char, or masking with & 0xff: printf("%x \n", array2[0] & 0xff).)


Expanding on "They're the same because they're all different representations of -1":

int16_t is a signed 16-bit type. It can contain values in the range -32768 to +32767.
char is an 8-bit type, and on your machine it's evidently signed also. So it can contain values in the range -128 to +127.

0xff is decimal 255, a value which can't be represented in a signed char. If you assign 0xff to a signed char, that bit pattern ends up getting interpreted not as 255, but rather as -1. (Similarly, if you assigned 0xfe, that would be interpreted not as 254, but rather as -2.)

0xffff is decimal 65535, a value which can't be represented in an int16_t. If you assign 0xffff to a int16_t, that bit pattern ends up getting interpreted not as 65535, but rather as -1. (Similarly, if you assigned 0xfffe, that would be interpreted not as 65534, but rather as -2.)

So when you said

int16_t array1[] = {0xffff,0xffff,0xffff,0xffff};

it was basically just as if you'd said

int16_t array1[] = {-1,-1,-1,-1};

And when you said

char array2[] = {0xff,0xff,0xff,0xff};

it was just as if you'd said

char array2[] = {-1,-1,-1,-1};

So that's why *array1 == *array2, and array1[0] == array2[0].


Also, it's worth noting that all of this is very much because of the types of array1 and array2. If you instead said

uint16_t array3[] = {0xffff,0xffff,0xffff,0xffff};
unsigned char array4[] = {0xff,0xff,0xff,0xff};

You would see different values printed (ffff and ff), and the values from array3 and array4 would not compare the same.

Another answer stated that "there is no type information in C at runtime". That's true but misleading in this case. When the compiler generates code to manipulate values from array1, array2, array3, and array4, the code it generates (which of course is significant at runtime!) will be based on their types. In particular, when generating code to fetch values from array1 and array2 (but not array3 and array4), the compiler will use instructions which perform sign extension when assigning to objects of larger type (e.g. 32 bits). That's how 0xff and 0xffff got changed into 0xffffffff.

Steve Summit
  • `They're the same because they're all different representations of -1.` So regardless of the type, I should think of the compared values in terms of their decimal representation? – TheMeaningfulEngineer Jul 22 '15 at 12:59
  • @Alan: let me expand on that in the answer. – Steve Summit Jul 22 '15 at 22:10
  • Use `"%hhx"` to cause the value to be cast to `unsigned char` before printing. – chux - Reinstate Monica Jul 22 '15 at 22:18
  • "So it can contain values in the range -127 to +127." Hmmm, more likely -128 to +127. – chux - Reinstate Monica Jul 22 '15 at 22:18
  • @chux: For an 8-bit extended integral type, only -127 to +127 would be required (sign-magnitude representation is permitted). But for `char` and `signed char`, indeed every bit pattern is required to have a distinguishable value (no -0 vs +0 issue) in order to satisfy the roundtripping requirement, so yes it should be `-128` to `+127` (if `CHAR_BIT == 8`). – Ben Voigt Jul 22 '15 at 22:21
  • With exact width types, code should use matching format specifiers from `<inttypes.h>`: `printf("%" PRIx16 " \n", array1[0]);` – chux - Reinstate Monica Jul 22 '15 at 22:23
  • @chux: There's no way to pass anything smaller than `int` to `printf`, btw. – Ben Voigt Jul 22 '15 at 22:23
  • @Ben Voigt Hmmmm, on a theoretical machine with 4-byte `int` and 2-byte `char *`, what would happen with `printf("%s", "Hello")`? Maybe a wide integer embedded processor with limited memory? In that case, something smaller than `int` is passed to `printf()`. – chux - Reinstate Monica Jul 22 '15 at 22:28
  • Steve Summit @Ben point is correct that a signed `char` would have a minimal range of -127 to 127. My point was what OP likely has. – chux - Reinstate Monica Jul 22 '15 at 22:29
  • @chux: Oh, yes, I know about 1's complement and sign/magnitude. But when I typed "-127", it was a typo, not a deliberate acknowledgement of the other representations, because if I was acknowledging the other representations, I would have said -32767 just above. – Steve Summit Jul 22 '15 at 22:32
  • @Steve Summit Subtle deduction: Minimum `int16_t` must be -32768 and not -32767. Since code has `int16_t`, it _must_ be 16-bit 2's complement capable (and no padding), and _certainly_ 2's complement for `char` be it 8 or 16 bit, padded or not, signed or unsigned. Things are more complex when code mixes base types with exact width ones - Sigh. – chux - Reinstate Monica Jul 22 '15 at 22:44
  • char cannot be padded and must have 2**CHAR_BIT unique values. No sign magnitude or ones complement allowed. – Ben Voigt Jul 22 '15 at 22:56
  • It should be clarified that the out-of-range assignment is *implementation-defined* and it's theoretically possible that some systems will store a different value than `-1` – M.M Jul 22 '15 at 23:28

Because there is no type information in C at runtime, and by using a plain %x for printing, you tell printf that the argument is an unsigned int. The poor library function just trusts you ... see "Length modifier" in printf(3) for how to give printf the information it needs.

  • Does `no type information in C at runtime` imply that the compared elements (`(*array1 == *array2`) are also cast to some default type before comparison? – TheMeaningfulEngineer Jul 22 '15 at 12:22
  • Technically, that's not even a cast any more *at runtime* as there aren't any types, just bytes of data ;) But yes, from the compiler's view, comparing 2 integers of different sizes means converting the smaller one to the same size as the bigger one. –  Jul 22 '15 at 12:26

Using %x to print negative values causes undefined behaviour so you should not assume that there is anything sensible about what you are seeing.

The correct format specifier for char is %hhd, and for int16_t it is "%" PRId16. You will need #include <inttypes.h> to get the latter macro.

Because of the default argument promotions, it is also correct to use %d with char and int16_t [1]. If you change your code to use %d instead of %x, it will no longer exhibit undefined behaviour, and the results will make sense.


[1] The C standard doesn't actually say that, but it's assumed that that was the intent of the writers.

Community
M.M