Casting signed to unsigned and vice versa while widening the byte count

Question

uint32_t a = -1;            // 11111111111111111111111111111111
int64_t b = (int64_t) a;    // 0000000000000000000000000000000011111111111111111111111111111111

int32_t c = -1;             // 11111111111111111111111111111111
int64_t d = (int64_t) c;    // 1111111111111111111111111111111111111111111111111111111111111111

From the observation above, it appears that only the signedness of the original value matters. I.e., if the original 32 bit number is unsigned, casting it to a 64 bit value will add 0's to its left regardless of the destination value being signed or unsigned; and

if the original 32 bit number is signed and negative, casting it to a 64 bit value will add 1's to its left regardless of the destination value being signed or unsigned.

Is the above statement correct?

For the specific examples you show, all you're seeing is that the values are unchanged. In the first case, 4294967295 stays 4294967295, in the second -1 remains -1. — Thomas Jager, Feb 24 '20 at 01:30
Yes, I'm not talking about the value itself, I'm talking about its sign extension. — , Feb 24 '20 at 01:31
Since the value doesn't change, it's not about properties of C, but properties of unsigned and 2's complement representations. For an unsigned value, yes, zeros will be added to the front. For a signed value, yes, sign extension will occur. The fixed-width integers are very handy when the representation matters. What *exactly* do you mean by "Same happens the other way."? — Thomas Jager, Feb 24 '20 at 01:33
So when doing the sign extension, the original value's type matters and not the destination's, correct? — , Feb 24 '20 at 01:38
No, that is not the case. For the specific example given, going from fixed-width 32-bit integer to an `int64_t` things behave as you describe. As soon as you leave the realm of fixed-width types, signed representations are implementation-defined. Also, the width of the types matters. You have to refer to the [Usual arithmetic conversions](https://en.cppreference.com/w/c/language/conversion) to know what happens to the values. From there, knowing the representation allows you to determine the binary result. — Thomas Jager, Feb 24 '20 at 01:42
You've also edited your question, completely changing what you're asking. "if the original 32 bit number is signed, casting it to a 64 bit value will add 1's to its left regardless of the destination value being signed or unsigned." This is completely incorrect. It only applies to fixed-width integers, and only to negative values. — Thomas Jager, Feb 24 '20 at 01:47
Sorry, I couldn't really understand what you mean. What are non-fixed-width types? Do you have an example? — , Feb 24 '20 at 01:47
The fixed-width types are things like `uint8_t`, `int32_t`, `uint64_t`, etc. Other types would be `int`, `signed short`, `unsigned long`. You have no way of knowing exactly how the bits of an `(int) -1` are arranged. — Thomas Jager, Feb 24 '20 at 02:01
`(int) -1` has 32 `1`s, doesn't it? I'm still confused by what you mean. What do you mean by `implementation-defined`? I'm only concerned with 2's complement if that helps. The rest seem to be dead. — , Feb 24 '20 at 02:03
It does in [Two's complement](https://en.wikipedia.org/wiki/Signed_number_representations#Two's_complement). There are however [many representations of signed numbers](https://en.wikipedia.org/wiki/Signed_number_representations). Refer to [this answer](https://stackoverflow.com/a/3952262/5567382). — Thomas Jager, Feb 24 '20 at 02:05
OK, as a final answer, is my statement above valid for 2's complement only then? I have added your `negative signed` correction. — , Feb 24 '20 at 02:07
@Thomas, I don't think the standard differs between the different representations of negatives, does it? It deals only with the result, which is why it says (effectively) "add MAXVAL + 1 to a negative value until it fits inside the target unsigned". For 2's complement, that's just reinterpreting as an unsigned value, it's likely to be slightly different for the other two representations, but similar outcome. I *could* be wrong, it certainly wouldn't be the first time :-) — paxdiablo, Feb 24 '20 at 02:19
@paxdiablo Going from a signed to an unsigned, I think what you're saying is correct. The issue is that Jeff has made a number of statements about the binary representation that are not always true. — Thomas Jager, Feb 24 '20 at 02:37
@ThomasJager Is my statement still not true for 2's complement? The answer below says it's true. — , Feb 24 '20 at 02:39
@Jeff Considering only two's complement, the statement "if the original 32 bit number is signed and negative, casting it to a 64 bit value will add 1's to its left regardless of the destination value being signed or unsigned." is correct. You should specify in your question this statement applies to 2's complement. As it is written, you're also including other representations. — Thomas Jager, Feb 24 '20 at 02:46
Also, the statement "I'm only concerned with 2s complement if that helps. the rest seem to be dead." is very dangerous. If you rely on the implementation for behavior, you're asking for bugs. (Ignoring cases where the code won't compile when not targeting the correct implementation.) — Thomas Jager, Feb 24 '20 at 02:47

score 2 · Accepted Answer · edited Jun 20 '20 at 09:12

Correct, it's the source operand that dictates this.

uint32_t a = -1;
int64_t b = (int64_t) a;

No sign extension happens here because the source type, uint32_t, is unsigned. The basic idea of sign extension is to ensure the wider variable has the same value (including sign). Coming from an unsigned integer type, the value is never negative, so the upper bits are filled with zeros. This is covered by the standards snippet /1 below.

Negative sign extension (in the sense that the top 1-bit in a two's complement value is copied to all the higher bits in the wider type^(a)) only happens when a signed type is extended in width, since only signed types can be negative.

If the original 32 bit number is signed and negative, casting it to a 64 bit value will add 1's to its left regardless of the destination value being signed or unsigned.

This is covered by the standards snippet /2 below. You still have to maintain the sign of the value when extending the bits, but pushing a negative value into an unsigned variable mathematically adds MAX_VAL + 1 (where MAX_VAL is the maximum value of the target type) to the value until it is within the range of the target type. In reality, for two's complement, no adding is done: the sign-extended bit pattern is simply interpreted as an unsigned value.

Both these scenarios are covered in the standard, in this case C11 6.3.1.3 Signed and unsigned integers /1 and /2:

1/ When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.

2/ Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.

3/ Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

Note that your widening conversions are covered by the first two points above. I've included the third point for completeness as it covers things like conversion from uint32_t to int32_t, or from unsigned int to long where they have the same width (they both have a minimum range but there's no requirement that unsigned int be "thinner" than long).

^(a) This may be different in ones' complement or sign-magnitude representations but, since they're in the process of being removed, nobody really cares that much.

See:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0907r1.html (WG21, C++); and
http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2218.htm (WG14, C)

for more detail.

In any case, the signed fixed-width types (such as int32_t and int64_t) are required to use two's complement, so you don't have to worry about this aspect for your example code.

Thanks, doesn't my statement say the same thing? I modified it a bit. — , Feb 24 '20 at 01:40
@Jeff, yes, I originally read your question as "how does this work?" rather than "does it work this way?", that's hopefully fixed now. The answer basically confirms what you suspect, referencing the standards section that controls it. — paxdiablo, Feb 24 '20 at 01:48
There is another comment under my question which says the statement is not correct and only applies to fixed-width data types. What is that about? — , Feb 24 '20 at 01:56
@Jeff, I believe the quote you are referring to is `signed representations are implementation-defined` - that's because two's complement is only *one* of the three possible representations for negative numbers (at the moment - changes are afoot to remove the other two). However, I don't think that affects sign extension *outcomes,* only the method by which it is achieved. — paxdiablo, Feb 24 '20 at 02:02
I see, yes, I'm only concerned with 2's complement. I also had to modify my second statement a bit, since sign extending a signed number adds 1's to its left only if it's negative. — , Feb 24 '20 at 02:06
