
I stumbled upon this problem while saving an int in a char array and converting it back. I used bit shifts and bitwise OR, but my result ended up with every byte above the least significant one set to 0xFF.

My question is: considering this example

#include <stdio.h>

int main() {
        char c1 = 0x86;
        unsigned char c2 = 0x86;
        unsigned int i1 = 0, i2 = 0;

        i1 = (unsigned int) c1;
        i2 = (unsigned int) c2;

        printf("%x-%x\n", i1, i2);
}

Why is the output ffffff86-86? Why does the char have all its upper bits set to 1?

I'm sure there is a very simple answer, but I couldn't come up with a specific enough query to find it on Google.

Maldus
  • I didn't know about two's complement, and I thought negative values were handled with a single bit. As I said, I had no idea how to put this concept into words and query it on Google. You can either provide an answer or, if you have the power to, flag this question as invalid. Passive-aggressive comments achieve nothing but wasting your time. – Maldus Mar 14 '18 at 09:00
  • Yes there is a simple answer: enable compiler warnings. And then never use the `char` type for any other purpose than storing characters - see linked duplicates. Use `uint8_t` for raw binary data. – Lundin Mar 14 '18 at 09:05

2 Answers


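It's implementation-defined whether char is signed or unsigned.

If char is signed, converting it to a wider integer type preserves its value, so a negative char stays negative when promoted to int (this is sign extension). Converting that negative value to unsigned int then wraps it modulo UINT_MAX + 1, which with a 32-bit unsigned int turns -122 (0x86 stored in a signed 8-bit char) into 0xffffff86.

The leading 1 bits are how negative numbers are represented in two's complement systems, which is the most common way to handle negative numbers.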


If, with your compiler, char is signed (which it seems to be) then the initialization of c1 should generate a warning. If it doesn't then you need to enable more warnings.
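As a minimal sketch of the conversion described above, the following compares a direct conversion with one that goes through unsigned char first. The function names (widen_plain, widen_byte, check_widening) are made up for illustration, and the sketch assumes 8-bit chars and that (char)0x86 yields -122 when char is signed, which is implementation-defined but what common compilers do.

```c
#include <assert.h>
#include <limits.h>

/* Direct conversion: if char is signed and c is negative, the value
 * wraps modulo UINT_MAX + 1, which sets all the upper bits. */
unsigned int widen_plain(char c) {
    return (unsigned int)c;
}

/* Convert to unsigned char first, so the result is always in
 * the range 0..UCHAR_MAX regardless of the signedness of char. */
unsigned int widen_byte(char c) {
    return (unsigned int)(unsigned char)c;
}

/* Hypothetical self-check; call it from main() to verify the behaviour
 * on your implementation. Assumes CHAR_BIT == 8. */
void check_widening(void) {
    char c = (char)0x86; /* -122 if char is signed (implementation-defined) */

    assert(widen_byte(c) == 0x86u);
    if (CHAR_MIN < 0) {
        /* signed char: -122 becomes UINT_MAX + 1 - 122 */
        assert(widen_plain(c) == UINT_MAX - 0x79u);
    } else {
        /* unsigned char: the value is simply 0x86 */
        assert(widen_plain(c) == 0x86u);
    }
}
```

Going through unsigned char (or masking with 0xFF) is one way to get 0x86 even when the data is already stored in a plain char.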

Some programmer dude
  • I was under the impression the negative value was expressed with a single bit as an int. Oh well, I doubt I would have found out on my own. Thanks! – Maldus Mar 14 '18 at 08:51

char c1 = 0x86; Here c1 is a plain char, which is signed on your implementation (whether plain char is signed is implementation-defined).

c1 => 1000 0110
      |
      sign bit is set (1)

When the statement i1 = (unsigned int) c1; executes, the sign bit of c1 is copied into the remaining bytes of i1:

   i1 =>                         1000 0110
                                 |
                                 |<--this sign bit 
   1111 1111 1111 1111 1111 1111 1000 0110 
     f   f    f    f    f    f     8    6

In i2 = (unsigned int) c2; the source c2 is an unsigned char, so there is no sign bit to copy. The upper bytes of i2 are filled with zeros and only the low byte, 0x86, is printed.
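For the original use case (storing an integer in a byte array and reading it back with shifts and bitwise OR), a sketch that avoids the sign-extension problem might look like the following. The helper names store_u32_be and load_u32_be and the big-endian byte order are assumptions chosen for illustration, not something taken from the question; the key point is that the buffer is unsigned char, so each byte widens to 0..255 before it is shifted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Write v into buf[0..3], most significant byte first. */
void store_u32_be(unsigned char *buf, uint32_t v) {
    assert(buf != NULL);
    buf[0] = (unsigned char)((v >> 24) & 0xFFu);
    buf[1] = (unsigned char)((v >> 16) & 0xFFu);
    buf[2] = (unsigned char)((v >> 8) & 0xFFu);
    buf[3] = (unsigned char)(v & 0xFFu);
}

/* Read four bytes back. Each byte is unsigned, so converting it to
 * uint32_t never sets the upper bits, and OR-ing the shifted bytes
 * reassembles the original value. */
uint32_t load_u32_be(const unsigned char *buf) {
    assert(buf != NULL);
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8) |
           (uint32_t)buf[3];
}
```

If the buffer has to stay a plain char array, casting each element to unsigned char before widening it has the same effect.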

Achal