C bitwise AND gives different result with -O0 and -O2

Question

I'm working on a PC emulator using both Bochs and DOSBox as references.

When emulating the "NEG Ed" instruction (two's complement negation of a doubleword) I'm getting different results if I compile with -O0 rather than -O2.

This is a test program with just the relevant bits:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdbool.h>

int main(int argc, const char **argv)
{
    uint32_t value = strtol(argv[1], NULL, 16);
    uint32_t negation = -(int32_t)(value);
    bool sign = negation & 0x80000000;

    printf("value=%X, negation=%X, sign=%X\n", value, negation, sign);
    
    return 0;
}

The -(int32_t)(value); part is taken from Bochs' NEG_EdM() function; for the equivalent operation DOSBox doesn't cast to signed int.

If you compile this program with GCC 10 using the -O2 option and use the hexadecimal value 0x80000000 as input, you'll get a wrong result for sign:

value=80000000, negation=80000000, sign=0

When compiled with -O0 the result is correct:

value=80000000, negation=80000000, sign=1

Is this undefined behaviour?

As far as I know casting to/from signed and unsigned integers is well defined, as is bitwise & on unsigned values.

Read more about UBs https://stackoverflow.com/questions/2397984/undefined-unspecified-and-implementation-defined-behavior — 0___________, Jan 31 '21 at 11:01
This isn't valid C because you use bool, but without including stdbool.h. — Paul Hankin, Jan 31 '21 at 11:09
He had to include it as otherwise it will not copmile at all https://godbolt.org/z/e4ET66. He simply did not copy-paste the whole code — 0___________, Jan 31 '21 at 11:11
This is a good question, but the title should be modified to more accurately reflect it. The question is about UB, and that should explicitly appear in the title. Otherwise, the reader of the title will simply assume "that's UB" and not bother reading the question. — William Pursell, Jan 31 '21 at 11:14

Eric Postpischil · Accepted Answer · 2021-01-31T15:14:39.537

Source of Undefined Behavior

The key part of the problem is in the negation of -(int32_t)value.¹

At this point, value is 80000000₁₆ (2³¹). Since that is not representable in int32_t, the conversion is governed by C 2018 6.3.1.3 3, which says the behavior is implementation-defined. GCC 10.2 defines it to wrap modulo 2^N, where the destination width is N bits. Wrapping 80000000₁₆ to int32_t modulo 2³² produces −80000000₁₆.

Then the negation operator, -, is applied. The mathematical negation of −80000000₁₆ is of course 80000000₁₆, but this is not representable in int32_t.² The behavior is governed by C 2018 6.5 5:

If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.

Thus, the negation has undefined behavior. When -O0 is used, the compiler generates simple direct code. Godbolt shows it generates a negate instruction that wraps, producing the output 80000000₁₆ for the input bits 80000000₁₆ (which would represent −80000000₁₆ as a signed 32-bit integer). When -O2 is used, the compiler performs complicated analysis and transformation of the program, and the lack of defined behavior leaves the compiler free to produce any result. Indeed, Godbolt shows the negate instruction is absent. Effectively, the compiler “knows” that negating an int32_t value can never produce a result with the 2³¹ bit set in a program with defined behavior.

Discussion of Optimization

Consider that the range of values representable in int32_t is −2³¹ to 2³¹−1. The mathematical negations of these are −(2³¹−1) to 2³¹. However, 2³¹ overflows, resulting in an exceptional condition. The range of results that do not overflow is −(2³¹−1) to 2³¹−1. Therefore, in a program with defined behavior, only these results occur, and therefore the compiler may optimize as if only these results occur. In none of these results is the 2³¹ bit set. Therefore, in a program with defined behavior, negation & 0x80000000 is always zero, and the compiler may generate code based on this.

Fix

It appears you want a test for whether the sign bit would be set in an int32_t that is negated using two’s complement, that is, wrapping the result modulo 2³². For this, unsigned arithmetic can be used. If x is either an int32_t value or a uint32_t containing the bits that would represent such a value, then the sign bit of the negated value may be obtained with either of:

bool sign = - (uint32_t) x & 0x80000000u;
bool sign = - (uint32_t) x >> 31;

Footnote

¹ We deduce long is wider than 32 bits. Were it not, strtol("0x80000000", NULL, 16) would return LONG_MAX, per C 2018 7.22.1.4 8. That would be representable in uint32_t and int32_t, so value would be initialized to LONG_MAX, converting to int32_t would keep that value, negation would be −LONG_MAX, and sign would be zero in both the optimized and unoptimized versions of the program.

² If int32_t were narrower than int, the operand would be promoted to int before the negation, and the mathematical result would be representable. That is not the case with the GCC version and options you used, which we can deduce from the observed results.

I guess Bochs still works because of the way it computes the x86 SF flag. Probably the compiler doesn't have the same optimizazion opportunities. — barotto, Jan 31 '21 at 14:50

chqrlie · Answer 2 · 2021-01-31T13:31:36.413

There are some issues in your code:

the value returned by strtol("0x80000000", NULL, 16) depends on the range of type long: if type long has 32 bits, the returned value should be LONG_MAX, which is 2147483647, whereas if long is larger, 2147483648 will be returned. Converting these values to uint32_t does not change the value as both as within the range of uint32_t. Type long on your system seems to have 64 bits. You could avoid this implementation defined behavior using strtoul() instead of strtol().
there is no need for the intermediary cast to (int32_t): negating the unsigned value is well defined and -0x80000000 has the value 0x80000000 for type uint32_t.
furthermore, this conversion is counterproductive and the likely cause of the observed behavior as negating the value INT32_MIN has undefined behavior because of signed arithmetic overflow. With optimisations enabled, the compiler determines that you are extracting the sign as if by bool sign = -(int32_t)value < 0 and simplifies this expression as bool sign = (int32_t)value > 0, which is correct for all values except INT32_MIN for which the compiler considers any behavior to be OK, since the behavior is undefined anyway. You can check the code on Godbolt's Compiler Explorer.
you use type bool without including <stdbool.h>: the program should not compile. Is this a copy/paste error or do you compile as c++? The C99 _Bool semantics add an implicit test in the intialization statement, but it would be better to make it explicit and write:
```
bool sign = (negation & 0x80000000) != 0;
```
finally, you pass uint32_t values to printf for %X conversion specifiers. This is incorrect if the type int has fewer than 32 bits on your platform. Use the macros from <inttypes.h>.

Try this modified version:

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <inttypes.h>

int main(int argc, const char **argv)
{
    uint32_t value = strtoul(argv[1], NULL, 16);
    uint32_t negation = -value;
    bool sign = (negation & 0x80000000) != 0;

    printf("value=%"PRIX32", negation=%"PRIX32", sign=%d\n", value, negation, sign);
    
    return 0;
}

Your unfortunate experience is rooted in the undefined behavior of signed arithmetic overflow. Compilers can take advantage of undefined behavior to implement advanced optimisations such as removing the end test in for (int i = 0; i > 0; i++) as well as more obvious yet non-trivial ones such as converting void f(int i) { int j = i * 2 / 2; ... to int j = i; which may exhibit different behavior for values exceeding 0x3fffffff.

Other languages (ie: java) try and remove undefined behavior and fully specify two's complement implementation and behavior, hence will not perform these optimisations.

The Standard C language committee seems to favor more optimisations at the cost of some surprises in border cases, which may be very difficult to spot and address. Your example is a perfect illustration of this problem.

Without `#include ` OPs code will not compile. So it is IMO copy-paste error https://godbolt.org/z/e4ET66 — 0___________, Jan 31 '21 at 11:13
Tagged `[C]`. If compiled as fortran then will not compile as well. But how is it related to the tagged language? — 0___________, Jan 31 '21 at 11:35
Converting an unsigned integer type to a signed integer type outside the destination range is not undefined behavior. It is implementation-defined per C 2018 6.3.1.3 3. — Eric Postpischil, Jan 31 '21 at 11:44
@EricPostpischil: indeed the conversion has implementation defined behavior, the real issue is this: negating `(int32_t)value` has undefined behavior for `value == 0x80000000` unless type `int` is larger than `int32_t`. I amended the answer with a detailed explanation. Your answer addresses this issue in even more detail :) — chqrlie, Jan 31 '21 at 14:23
The missing stdbool.h include was copy-paste error, fixed, thanks. — barotto, Jan 31 '21 at 14:39

C bitwise AND gives different result with -O0 and -O2

2 Answers2

Source of Undefined Behavior

Discussion of Optimization

Fix

Footnote