char to int conversion mechanism

Question

I am working with arrays of bytes and trying to better understand what happens during a char to int conversion.

Here is the code:

#include <QDebug>

int main(int argc, char *argv[])
{
      Q_UNUSED(argc);
      Q_UNUSED(argv);

      char val = 0xff; // 1111 1111

      int a = val;

      unsigned int b = val;

      qInfo() << "size of int is " << sizeof(int);

      qInfo() << a << " " << b << " " << 0xff ;
      qInfo() << QString("0x%1 0x%2 0x%3").arg(a, 16, 16, QChar('0')).arg(b, 16, 16, QChar('0')).arg(0xff, 16, 16, QChar('0'));

      return 0;

}

And the output:

size of int is  4
-1   4294967295   255
"0xffffffffffffffff 0x00000000ffffffff 0x00000000000000ff"

As my char has the value 1111 1111b, I expected (wrongly) that the conversion would fill my 4-byte int with 0x00 0x00 0x00 0xff (assuming big-endian byte order, although it doesn't matter here), as seems to be the case for numeric literals like 0xff.

I know that the bit pattern 0xff can be interpreted as 255 or -1 in decimal (as an unsigned or a signed 8-bit integer, with the first bit used for the sign); nevertheless, I still don't really understand the mechanism that produces this output.

EDIT: As pointed out in the comments, I would like to understand the mechanism that "extends" the signed char value from one byte to 4 bytes. I am not asking why the char type is signed or unsigned on my platform, but how the 3 remaining bytes are built and what rules are behind this output.

Thanks.

The problem is that the `char` type has implementation-defined signedness and can be either signed or unsigned depending on system. You can't really know which unless you dig in some specific compiler documentation. On your system it turned out to be signed and therefore the program doesn't behave like you expect. Therefore, never use the `char` type to store integer data. Use `uint8_t` instead. — Lundin, Dec 08 '17 at 14:20
@Lundin That's not the point: I knew that char was signed on my platform, and using unsigned char gives the expected result. What I want is to understand the logic of the conversion mechanism when signed chars are in use, so I don't really see my question as a duplicate. Maybe I am wrong, you tell me. — Fryz, Dec 08 '17 at 14:41
Because `char` is signed and it is being converted to a larger type, it gets sign-extended rather than zero-extended. — Christian Gibbons, Dec 08 '17 at 14:52
@ChristianGibbons Yes, that's exactly what I am trying to understand. How does the extension mechanism actually work for the int and unsigned int conversions? Maybe I should edit my question to be more specific and have it reopened. — Fryz, Dec 08 '17 at 14:55
There's not much else to understand. The standard says "When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.". That's what is happening here. The value of a signed char can be stored in a signed int. The value is preserved. You had -1 before and you get -1 after. — Lundin, Dec 08 '17 at 14:59
As I said, it sign-extends. If you want to know how sign-extension works, it's not hard to find. https://en.wikipedia.org/wiki/Sign_extension . Cliff's notes: pad bit values to the left with whatever the Most Significant Bit is. If the number is negative, the MSb will be 1, otherwise it will be 0. — Christian Gibbons, Dec 08 '17 at 15:07
Thanks for the link. It's what I was looking for. @Lundin I would appreciate it if you removed the "duplicate" tag from my question, as I have edited it to make it clearer. Thanks anyway. — Fryz, Dec 08 '17 at 15:20
I can replace it with a different duplicate if it matters to you... https://stackoverflow.com/search?q=%5Bc%5D+sign+extension. Pick your favourite flavour of the 591 existing questions on the topic of "sign extension". — Lundin, Dec 08 '17 at 15:23
