This question is about what the C standard(s) require, not what the mainstream compilers can be reliably expected to do. Let's focus on C99, though input about how C89/ANSI or C11 diverge would also be welcome.
Section 6.2.6.2 introduces the notion of a "negative 0" for integer types, which would be `0x8000...` in a system using sign-and-magnitude representation of the integers, or `0xFFFF...` in a system using one's complement representation. (The standard also permits such systems to use those bit patterns as trap representations instead of negative 0; and systems using the dominant/more usual two's complement representation have no negative 0.)
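For background (and to check that I'm reading the representation clauses right), here's a small sketch of how I understand those three representations to decode the same 16-bit pattern; the `interpret` helper is purely illustrative, my own invention rather than anything the standard defines:

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative helper (mine, not the standard's): decode a 16-bit pattern
 * under each of the three representations 6.2.6.2 permits.
 * 't' = two's complement, 'o' = one's complement, 's' = sign-and-magnitude. */
static long interpret(uint16_t bits, char scheme)
{
    long magnitude = bits & 0x7FFF;     /* value bits */
    int  sign      = (bits >> 15) & 1;  /* sign bit   */

    if (sign == 0)
        return magnitude;               /* non-negative: same value in all three */
    switch (scheme) {
    case 't': return magnitude - 0x8000;    /* two's complement                  */
    case 'o': return -(0x7FFF - magnitude); /* one's complement: 0xFFFF -> -0    */
    default:  return -magnitude;            /* sign-and-magnitude: 0x8000 -> -0  */
    }
}

int main(void)
{
    printf("0xFFFF: two's %ld, one's %ld, sign-mag %ld\n",
           interpret(0xFFFF, 't'), interpret(0xFFFF, 'o'), interpret(0xFFFF, 's'));
    printf("0x8000: two's %ld, one's %ld, sign-mag %ld\n",
           interpret(0x8000, 't'), interpret(0x8000, 'o'), interpret(0x8000, 's'));
    return 0;
}
```

The point is just that the all-ones pattern and the sign-bit-only pattern are the "negative zero" candidates under one's complement and sign-and-magnitude respectively, while under two's complement they are ordinary values.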
Further on in that section, the standard says:
> If the implementation supports negative zeros, they shall be generated only by:
>
> - the `&`, `|`, `^`, `~`, `<<`, and `>>` operators with arguments that produce such a value
> - ...
>
> It is unspecified whether these cases actually generate a negative zero or a normal zero, and whether a negative zero becomes a normal zero when stored in an object.
> If the implementation does not support negative zeros, the behavior of the `&`, `|`, `^`, `~`, `<<`, and `>>` operators with arguments that would produce such a value is undefined.
So far as I can see, the standard doesn't further specify what is meant by "arguments that would produce such a value". But the natural way to read this says that any operation like `x | y`, where you would expect the result to be `0x8000...` or `0xFFFF...`, in fact has undefined behavior according to the standard.
Huh? Really? So according to the standard, even on a two's complement machine, `~0` really results in undefined behavior? That's crazy. This can't be the intended interpretation.
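For concreteness, here is the sort of expression I'm worried about; the comments reflect what I'd assume a typical two's complement implementation does in practice, not anything the standard promises:

```c
#include <stdio.h>

int main(void)
{
    /* On a typical two's complement machine, ~0 is just -1 (all bits set).
     * That all-ones pattern is a "negative zero" only under one's complement,
     * which is the case the quoted wording seems to be aimed at. */
    int x = ~0;
    printf("~0 = %d (hex %x)\n", x, (unsigned)x);  /* typically: -1 (hex ffffffff) */
    return 0;
}
```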
Do they mean, perhaps: on systems which use sign-and-magnitude or one's complement representations and don't have a negative zero, the behavior of bitwise operations that would produce the bit pattern that would be a negative zero (if those systems had one) is undefined; and on systems with two's complement representations, the standard here remains silent?
If you look further ahead in the standard (section 6.5.7), it tells us that:
```c
uint32_t x = 1;
int32_t y = 1;

x << -1;              /* is undefined, so is y << -1 */
x << 32;              /* is undefined, so is y << 32 */
y << 31;              /* shifting 1 into sign bit is also undefined */
((int32_t) -1) << 1;  /* is undefined */
((int32_t) -1) >> 1;  /* is implementation-defined */
```
(Here I assume that the small integer literals, which are typed as `int`s, aren't of a rank higher than `int32_t` and `uint32_t`.)
In these later sections, the standard doesn't say anything specific about when operations using `&`, `|`, `^`, or `~` would be undefined. (It only says there that "the usual arithmetic conversions" are performed on the operands. That could only result in undefined behavior if the conversion promoted to a signed type some out-of-range unsigned value, and if I understand the conversion rules rightly, that can never happen.)
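To illustrate why I believe those conversions are harmless, here's a sketch of my understanding of the promotions, assuming the common case of a 16-bit `short` and a 32-bit `int`; the particular values are just my own examples:

```c
#include <stdio.h>

int main(void)
{
    /* Integer promotions: unsigned char and unsigned short are promoted to
     * (signed) int whenever int can represent all of their values, so the
     * promotion itself can never produce an out-of-range result. */
    unsigned char  uc = 0xFF;      /* 255 */
    unsigned short us = 0xFFFF;    /* 65535, which still fits in a 32-bit int */

    printf("%d\n", uc & 0x0F);     /* both operands of & are ints here: 15 */
    printf("%d\n", us | 1);        /* again plain int arithmetic: 65535 */

    /* With operands of rank >= int, the usual arithmetic conversions go
     * toward the unsigned type (or toward a wider signed type that can hold
     * every value), so no unsigned value ends up out of range in a signed type. */
    unsigned int ui = 0xFFFFFFFFu;
    printf("%u\n", ui & 3u);       /* stays unsigned: 3 */
    return 0;
}
```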