1

I have the following code in C:

char x, y; // Some random values
unsigned ux = (unsigned)x;
unsigned uy = (unsigned)y;

And I need to determine if the expression ux - uy == -(y - x) is always true or not, and to prove it or give a counter example.

I don't know if it's true or not, because they are different types of integers, with different sizes and two of them are signed and the other two are unsigned.

chqrlie
  • 131,814
  • 10
  • 121
  • 189
Daniel
  • 229
  • 1
  • 8
  • 1
    @Cherubim I know that some compilers are different then others, so I need to know what should happen in this case, and not what actually some compilers do. – Daniel Apr 28 '20 at 08:50
  • 1
    @Cherubim You can't really prove anything with code here, since there's too much poorly-defined behavior. Save for how that program behaves on a certain computer, with a certain compiler, at a certain moment. – Lundin Apr 28 '20 at 08:51
  • @Daniel: you can accept one of the answers by clicking on the grey checkmark below its score. – chqrlie May 03 '20 at 23:03

2 Answers2

2

The question is more subtle than it looks. If this is from an interview, you would be welcome to elaborate on char signedness which is implementation defined, on the size of these types, also implementation defined, on integer promotion from char to int or unsigned int depending on the size and signedness of type char, on type conversions implied by binary operators such as ==, on the difference between signed and unsigned subtraction, on random values and undefined behavior related to undefined variables and signed arithmetic overflow. If you master these subjects, you may discover that the interviewer might not. Extreme caution is advised to adjust your demonstration accordingly.

  1. the code in the question has undefined behavior for a simple reason:

    char x, y; // Some random values
    unsigned ux = (unsigned)x;
    unsigned uy = (unsigned)y;
    

    x and y do not have random values, they are uninitialized. Computing (unsigned)x, (unsigned)y and y - x all invoke undefined behavior. So as coded, the equality does not make sense, it is undefined.

  2. Assuming you initialize x and y to random values, in the general case, the question is the expression ux - uy == -(y - x) is always true or not? The answer is NO and here is a counter-example:

    On an architecture where char type has the same size and signedness as int, the expression y - x may overflow, for example if y = INT_MIN and x = 1, and this overflow has undefined behavior. Hence for this arhitecture and these values, the expression ux - uy == -(y - x) is not true, it is undefined.

  3. On more common architectures where char has 8 bits and int at least 16 bits, let's study the different possibilities:

    • in y - x, x and y are promoted to int since this type can represent all values of type char, whether type char is signed or not. y - x has type int and is in the range [-255, 255] and -(y - x), also int is in the same range. On these architectures, -(y - x) has the same value as -y + x and x - y. The question becomes: is ux - uy == x - y?

    • The expression ux - uy == x - y compares an expression of type unsigned int with one of type int. The int expression is converted to unsigned int and the comparison is performed on the unsigned values.

    • Converting a signed int value to unsigned int is performed by adding the value UINT_MAX + 1 to negative value and preserving positive values. The effect of this operation is to compute the arithmetic value modulo UINT_MAX+1.

    • The expression is thus evaluated as:

      ((x mod (UINT_MAX+1)) - (y mod (UINT_MAX+1))) mod (UINT_MAX+1) == (x - y) mod (UINT_MAX+1)

    • the modulo operator is distributive and the values are such that x - y is the arithmetic difference of x and y, regardless of the signedness of char.

    Hence for architectures where sizeof(int) > 1, the answer is YES, the equality holds for all values of x and y.

chqrlie
  • 131,814
  • 10
  • 121
  • 189
0

No, there are plenty of cases where ux - uy == -(y - x) is not true.

  • char has implementation-defined signedness, so there is no portable way to tell if x and y contain positive or negative values to begin with.

  • x and y will get integer promoted to int regardless of their signedness, potentially sign extended.

  • The operands of == will get balanced according to "the usual arithmetic conversions", meaning you end up implicitly converting the right operand to an unsigned type regardless of what value it had originally.

    In theory this is done in an implementation-defined way, but in practice it will work the same on all real-world 2's complement systems.

Summary: this code is far to brittle to work reliably/portably, due to implementation-defined behavior and implicit promotions. Possibly there is also some underflow/overflow that could happen given certain input.

See these posts for details:
Is char signed or unsigned by default?
Implicit type promotion rules

The solution to avoid these kind of problems is to be more careful with the types picked. Avoid arithmetic on small integer types. Pay attention to variable ranges. Learn the various implicit type promotion rules and how they can cause problems with change of signedness etc.

Community
  • 1
  • 1
Lundin
  • 195,001
  • 40
  • 254
  • 396
  • 1
    While your arguments are true, they do not necessarily imply that `ux - uy == -(y - x)` is not true, and you do not give a counter example that would answer the question. – chqrlie Apr 28 '20 at 09:03
  • @chqrlieforyellowblockquotes There's very similar examples in the implicit type promotion FAQ I linked. – Lundin Apr 28 '20 at 09:13
  • The links in reference do document how implicit promotions and `char` signedness work and are implementation specific, and help understanding these concepts a C programmer should know. Yet your post does not answer the OP's question. – chqrlie Apr 28 '20 at 10:02
  • x, y could be promoted to unsigned int too on some crazy platform :D – Antti Haapala -- Слава Україні Apr 28 '20 at 10:16