unsigned short s;
s = 0xffff;
int i = s;

How does the extension work here? Two higher-order bytes are added, but I'm confused whether they are filled with 1's or 0's. This is probably platform dependent, so let's focus on what Unix does. Would the two higher-order bytes of the int be filled with 1's or 0's, and why?

Basically, does the computer know that s is unsigned and correctly assign 0's to the higher-order bits of the int, so that i is now 0x0000ffff? Or, since ints are signed by default on Unix, does it take the sign bit from s (a 1) and copy that to the higher-order bytes?

Tony Stark

1 Answer


No, an unsigned value is never sign-extended. Upcasting will always pad such a value with zeroes.

More precisely, the unsigned variable represents a particular number, and it still represents the same number after the conversion, provided the new type can represent that value.
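As a rough illustration of that value-preserving rule, here is a minimal sketch in C. The function names (`widen_unsigned`, `widen_signed`, `check_widening`) are made up for this example, and it assumes `int` is wide enough to hold every `unsigned short` value, which is true on typical Unix platforms but not guaranteed by the language.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: int can represent every unsigned short value. */
#if USHRT_MAX > INT_MAX
#error "this sketch assumes int is wider than unsigned short"
#endif

/* Hypothetical helper: widening an unsigned short keeps the same number,
   so the new high-order bits are zero. */
int widen_unsigned(unsigned short s) {
    return s;
}

/* Hypothetical helper: widening a signed short also keeps the same number,
   so a negative value stays negative (the sign is extended). */
int widen_signed(short s) {
    return s;
}

/* Hypothetical helper: the question's snippet, with the result checked. */
void check_widening(void) {
    unsigned short s = 0xffff;
    int i = s;
    assert(i == 0xffff);                      /* zero-extended: 65535, not -1 */
    assert(widen_unsigned(s) == 65535);
    assert(widen_signed(-1) == -1);           /* same value, sign preserved */
    assert((unsigned)widen_signed(-1) == UINT_MAX); /* all bits set after widening */
}
```

Calling `check_widening()` from `main` should pass silently under the stated assumption. Both conversions follow the same rule (keep the value); the bit patterns differ only because the starting values differ.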

Hexadecimal or not, C (although not C99) and C++ are designed to work in the absence of bits, e.g. with base-10 numerics.

Potatoswatter
  • +1 Conceptually an unsigned number can't be sign extended. It doesn't have a sign bit to extend from. (Edit: although C and C++ are tied to a binary representation of integers so I don't really buy your last paragraph.) – CB Bailey Apr 24 '10 at 23:40
  • I would have phrased that last sentence as "the C standards have always left such holes of undefinedness, partly in order to accommodate pre-existing implementations, that you could have a standard-conforming compiler that does not use bits". – Pascal Cuoq Apr 24 '10 at 23:44
  • @Pascal: If C++ is any indication, adding certainty to base-10 support (`numeric_limits<>::radix`) is the trend. – Potatoswatter Apr 24 '10 at 23:50
  • So C 'knows' that my variable s here is unsigned, and therefore pads with 0's to keep it the same number? – Tony Stark Apr 24 '10 at 23:50
  • @hator: it knows since you declared it that way… – Potatoswatter Apr 24 '10 at 23:53
  • @Pascal: But it must act as though it uses bits. _6.2.6.2 Integer types_ (under 6.2.6 Representation of types) describes everything in terms of bits and powers of two. – CB Bailey Apr 24 '10 at 23:55
  • @Charles: I didn't realize C99 was so restrictive. Well, in any case, I believe that BCD was a big deal before C99, so reasonable to say C *was* designed to work that way, and C++ isn't adding any such restrictions. – Potatoswatter Apr 25 '10 at 00:13
  • @Charles Indeed, even the existence of exactly one sign bit for signed quantities is explicitly mentioned. Thanks for drawing my attention to this. – Pascal Cuoq Apr 25 '10 at 00:18
  • It's the same (basically) for C++. I don't think that `numeric_limits<>::radix` is ever likely to return anything other than two for built-in integer types. – CB Bailey Apr 25 '10 at 00:22
  • @Charles C++ unsigned numbers are supposed to be base-2… but it's all academic as wacky machines just don't like playing by the rules. – Potatoswatter Apr 25 '10 at 00:29