2

Let a be a variable of a signed integer type T and be U a corresponding unsigned type. The expression (U)a yields a value corresponding to the two's complement representation of the value of a as U. I want to know if the following is guaranteed by the C standard to undo that cast. Be u of type U and have the value of (U)a. Be MAX the maximum value the type T can hold. (Be aware of the implicit conversions to unsigned types and the fact that every positive value of a signed variable stays unchanged by these conversions.)

Firstly, suppose, T is able to hold the result:

T convert_2scomplement_to_T(U n) {
    return n<=MAX ? n : -(T)(U)-n;
}

Secondly, suppose, the function should detect such an invalid argument; be MIN the minimum value T can hold:

T convert_2scomplement_to_T_checked(U n) {
    if(n <= MAX) return n;
    if( !(n & (U)1 << sizeof(U)*CHAR_BIT-1) ) { // (*)
        // invalid argument, the value is positive and `T' cannot hold it
    }
    /* `n' represents something negative if we're here. */
    if(-n < MIN) {
        // invalid argument, the value is negative and `T' cannot hold it
    }
    return -(T)(U)-n;
}

The line marked with // (*) is not strictly conforming, as far as I can tell, because the standard don't make any guarantees about the position of the sign bit.

Do the described functions work as expected? And is the check for the sign bit avoidable in strictly conforming code?

(And besides the language-lawyer thing… it would be great, if someone who has written code knowing of at least one person having used it on a platform not using two's complement, could leave a comment, what machine this was. Wikipedia mentions

signed magnitude:

one's complement:

but this doesn't seem to be anything to worry about in programmes written today. Is there a reason the standard still addresses such machines?)

mafso
  • 5,433
  • 2
  • 19
  • 40
  • The question seems unclear; if you asking for something that only works on 2's complement machines? – M.M Jul 03 '14 at 01:45
  • " The expression (U)a yields a value corresponding to the two's complement representation of the value of a" - not true in general. If you are asking for the inverse of the operation `(U)a` in general, then just state that instead of bringing up 2's complement – M.M Jul 03 '14 at 01:45
  • @MattMcNabb: No, I'm asking for a strictly conforming solution. – mafso Jul 03 '14 at 01:45
  • You can use a union to type pun a signed and unsigned integer of the same size. C99 explicitly allows union type-punning. – Mysticial Jul 03 '14 at 01:47
  • @Mysticial: I don't want to type-pun. I want to undo the cast (they are equivalent on 2's-complement-machines, so I can't try my code because I don't have access to a non-2's-complement machine). – mafso Jul 03 '14 at 01:50
  • @MattMcNabb: Yes, it is true in general. The standard describes this as “[…] the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.“ (C99/C11 6.3.1.3 p.2), which is, effectively, 2's complement. – mafso Jul 03 '14 at 01:51
  • I disagree. What is the 2's complement representation of `-1`? You can't answer this without also being given the width of the type. Let's say `int` has 1 sign bit , 15 value bits, and 16 padding bits; and `unsigned int` has 32 value bits. Then the two's complement rep of `-1` would be `0xFFFF`, but `(U)a` is `0xFFFFFFFF`. So what you really mean is "the representation in 2's complement of the value of `a` but where the width to be used is the width of `U`". This seems an unnecessary diversion when you could just write `(U)a` which is completely unambiguous. – M.M Jul 03 '14 at 01:58
  • Is your question "How do I code `T func(U)` such that `t == func(t)` for all `t`?", or something else. – M.M Jul 03 '14 at 02:00
  • @MattMcNabb: Yes, that's exactly what I'm asking for. – mafso Jul 03 '14 at 02:01
  • @MattMcNabb: And yes, I didn't know that unsigned types also may have padding bits (somehow I thought, they wouldn't). If you think, you could reword the question to make it clearer it would be great if you would do so (you have enough rep, I think), I actually don't know where I was unclear. – mafso Jul 03 '14 at 02:03
  • You probably also need to include an assumption of no padding bits – M.M Jul 03 '14 at 02:05
  • Unsigned cannot have padding bits – M.M Jul 03 '14 at 02:05
  • @MattMcNabb: C99/C11 6.2.6.2 p.1, to me, seems to explicitly state the opposite: They may have padding bits (if it's not `unsigned char`). – mafso Jul 03 '14 at 02:07
  • @MattMcNabb: I don't want to add such an assumption, I want to stay strictly conforming. The whole thing is a language-lawyer question, I'll never write code supposed to be valid on a machine not using two's complement, I think. – mafso Jul 03 '14 at 02:09
  • @mafso OK. I thought other sections actually meant that unsigned padding bits couldn't conform, but I can't remember now what those other sections were – M.M Jul 03 '14 at 02:17
  • @MattMcNabb: That was, how I thought things were in the first place, it was actually your comment what made me looking into the standard about this and I came to the conclusion that padding is valid for (non-`char`) unsigned integer types. I, too, don't know if there are relevant parts of the standard giving more restrictions on unsigned representations… – mafso Jul 03 '14 at 02:21
  • 1
    If you are trying to convert from two's complement back to a signed number you may want to see the top answer (by hvd) of this question: http://stackoverflow.com/questions/13150449/efficient-unsigned-to-signed-cast-avoiding-implementation-defined-behavior The answer was in C++, but I think it also applies to C. – qbt937 Dec 28 '14 at 07:49
  • 1
    @qbt937; Thanks, didn't find that one. Voted to close my question as a duplicate. – mafso Dec 29 '14 at 11:01
  • 2
    @mafso You can try again to close it. – Artjom B. Apr 03 '15 at 09:47

2 Answers2

0

I don't think it's complicated. One's complement arithmetic is actually simpler, because every number is the the complement of its negated value. The only problem is dealing with all bits set because it translates as negative zero.

So yes, the cast S->U->S is definitely reversible, but U->S->U can fail for values with the high bit set. And you may not be able to undo it for negative zero because information can be lost.

Casting signed to unsigned preserves the bit pattern and is a no-op.

Unsigned to signed has to deal with numbers that have the high bit set. All bits on is legal and represents negative zero and is allowed to trap, be treated as zero or be forced to zero (at least that's my reading, but I don't have a Univac 1100 handy to try it on). Other values with the high bit set are outside the signed range so the behaviour is implementation defined. Most likely it will preserve the bit pattern, as the easiest thing to do.

If you're looking for guarantees you may not find them. Large parts of C/C++ are like that.

david.pfx
  • 10,520
  • 3
  • 30
  • 63
  • “Casting signed to unsigned preserves the bit pattern and is a no-op.”—No, it's only a no-op on machines using 2's complement. If not, this is a conversion to a 2's complement (ignoring padding bits). That is, if `n` is of type `int`, then `(unsigned)n` has a different value than `*(unsigned *)&n`, iff `n` is negative and we have a machine not using 2's complement. – mafso Jul 09 '14 at 01:25
  • Maybe my question was a little unclear: As an example (let's ignore padding bits for a moment), think of a program getting unsigned integers as input representing the 2's complement of a certain value and we want to store this _value_ in a corresponding signed integer object (I don't want to type-pun/reinterpret, I want to convert). – mafso Jul 09 '14 at 01:27
  • After doing some further research I have written an entirely new answer. Hope you like it. – david.pfx Jul 09 '14 at 11:22
0

Integral conversions for twos complement are defined as follows. n3936 S4.7/2,3 [n1570 s6.3.1.3 is similar]

If the destination type is unsigned, the resulting value is the least unsigned integer congruent to the source integer (modulo 2n where n is the number of bits used to represent the unsigned type). [ Note: In a two’s complement representation, this conversion is conceptual and there is no change in the bit pattern (if there is no truncation). —end note ]

Congruence is defined: (Maths) a relation between two numbers indicating that the numbers give the same remainder when divided by some given number.

Least means smallest or lowest in value. The 'some given value' is 2^number of bits.

So, the unsigned conversion evaluates (k mod 2^n) which is equal to (2^n - k).

  • with a four bit number, (signed)-1 => (unsigned)15
  • with a 16 bit number, (signed)-1 => (unsigned)65535
  • with a 32 bit number, (signed)-1 => (unsigned)4294967295

In each case this requires copying the bit pattern and incrementing the result.

If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.

In each of these cases the (unsigned) value cannot be represented in the (signed) destination type, so you will have to check the implementation definitions. A thoughtful implementation might reverse the operation specified above, but the standard does not require this.

So the answer is that the standard does not guarantee round-tripping of signed to unsigned integer conversions for twos complement implementation.

david.pfx
  • 10,520
  • 3
  • 30
  • 63
  • Thanks a lot for spending your time on this! I'll look at it later this day (I don't have the time right now). – mafso Jul 09 '14 at 13:24