
This question is about what the C standard(s) require, not what the mainstream compilers can be reliably expected to do. Let's focus on C99, though input about how C89/ANSI or C11 diverge would also be welcome.

Section 6.2.6.2 introduces the notion of a "negative zero" for integer types, which would be 0x8000... on a system using sign-and-magnitude representation of integers, or 0xFFFF... on a system using one's complement representation. (The standard also permits such systems to use those bit patterns as trap representations instead of negative zero; and systems using the dominant two's complement representation have no negative zero.)
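To make those bit patterns concrete, here's a small illustration (mine, not from the standard), assuming 32-bit types; on a two's complement machine the same two patterns instead represent INT32_MIN and -1:

#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    /* Pattern that would be negative zero under sign-and-magnitude:
       sign bit set, all value bits clear. */
    uint32_t sm_neg_zero = UINT32_C(0x80000000);

    /* Pattern that would be negative zero under one's complement:
       all bits set. */
    uint32_t oc_neg_zero = UINT32_C(0xFFFFFFFF);

    printf("sign-and-magnitude -0 pattern: 0x%08" PRIX32 "\n", sm_neg_zero);
    printf("one's complement   -0 pattern: 0x%08" PRIX32 "\n", oc_neg_zero);
    return 0;
}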

Further on in that section, the standard says:

  1. If the implementation supports negative zeros, they shall be generated only by:

    • the &, |, ^, ~, <<, and >> operators with operands that produce such a value
    • ...

    It is unspecified whether these cases actually generate a negative zero or a normal zero, and whether a negative zero becomes a normal zero when stored in an object.

  2. If the implementation does not support negative zeros, the behavior of the &, |, ^, ~, <<, and >> operators with operands that would produce such a value is undefined.

So far as I can see, the standard doesn't further specify what is meant by "operands that would produce such a value". But the natural way to read this is that any operation like x | y whose result you would expect to be 0x8000... or 0xFFFF... in fact, according to the standard, has undefined behavior.

Huh? Really? So according to the standard, even on a two's complement machine, ~0 really results in undefined behavior? That's crazy. This can't be the intended interpretation.
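For concreteness, here's the sort of expression I mean (a minimal sketch of mine; on a two's complement implementation everyone expects it to be well defined and to yield -1, and the question is whether the quoted wording can be read as making it undefined):

#include <stdio.h>

int main(void)
{
    int x = ~0;         /* all bits set: -1 on two's complement; the
                           negative-zero pattern on one's complement */
    printf("%d\n", x);  /* prints -1 on a two's complement machine */
    return 0;
}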

Do they mean, perhaps: on systems which use sign-and-magnitude or one's complement representations and don't have a negative zero, the behavior of bitwise operations that would produce the bit pattern that would be a negative zero (if those systems had one) is undefined? And on systems with two's complement representations, the standard here remains silent?

If you look further ahead in the standard (section 6.5.7), it tells us that:

#include <stdint.h>

void shift_examples(void)
{
    uint32_t x = 1;
    int32_t y = 1;

    x << -1;              /* undefined, and so is y << -1 */
    x << 32;              /* undefined, and so is y << 32 */
    y << 31;              /* shifting a 1 into the sign bit is also undefined */
    ((int32_t) -1) << 1;  /* undefined */
    ((int32_t) -1) >> 1;  /* implementation-defined */
}

(Here I assume that the small integer constants, which have type int, aren't of a rank higher than int32_t and uint32_t.)
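By contrast, and in line with the advice in the comments below to do bit manipulation only on unsigned types, the unsigned counterparts of those cases are all well defined. A sketch of mine, assuming int is 32 bits wide (so that uint32_t doesn't promote to a signed type):

#include <stdint.h>

void unsigned_shift_examples(void)
{
    uint32_t x = 1;
    uint32_t a = x << 31;          /* well defined: 0x80000000 */
    uint32_t b = UINT32_MAX << 1;  /* well defined: 0xFFFFFFFE */
    uint32_t c = UINT32_MAX >> 1;  /* well defined: 0x7FFFFFFF */
    (void)a; (void)b; (void)c;     /* silence unused-variable warnings */
}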

In these later sections, the standard doesn't say anything specific about when operations using &, |, ^, or ~ would be undefined. (It only says there that "the usual arithmetic conversions" are performed on the operands. That could only go wrong if the conversions forced some out-of-range unsigned value into a signed type, and if I understand the conversion rules rightly, that can never happen.)
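To illustrate what I mean (my own sketch, assuming unsigned int is 32 bits): in a mixed signed/unsigned expression, the usual arithmetic conversions convert the signed operand to the unsigned type, and signed-to-unsigned conversion is always well defined (reduction modulo UINT_MAX + 1); the risky direction, forcing an out-of-range unsigned value into a signed type, never arises:

#include <stdio.h>

int main(void)
{
    unsigned int u = 0x80000000u;  /* assumes unsigned int is 32 bits */
    int s = -1;

    /* Usual arithmetic conversions: s is converted to unsigned int,
       yielding 0xFFFFFFFF, so the & is pure unsigned arithmetic. */
    unsigned int r = u & s;
    printf("0x%X\n", r);           /* prints 0x80000000 */
    return 0;
}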

dubiousjim
  • Here are [two](https://stackoverflow.com/questions/46387704) [other](https://stackoverflow.com/questions/11644362) questions with relevant discussion. – dubiousjim Apr 16 '18 at 15:32
  • I wonder why you chose to focus on C99. Nostalgia? (Very good question.) – Stargateur Apr 16 '18 at 15:33
  • Do you really have this problem? Do you have access to any computer (there are very few left: very old machines, or younger ones compatible with them, like UNIVAC) that uses one's complement or sign-and-magnitude integer representation? – 0___________ Apr 16 '18 at 15:40
  • Note also that the bit pattern `1000...` on a two's complement machine could also be a trap representation. That hasn't been the case on any system I've used, but it's possible. – dbush Apr 16 '18 at 15:54
  • It may not be intentional, but the net effect of this part of the spec is that you shouldn't be doing bit manipulation with signed numbers, out of fear of UB. Use only unsigned types for bit manipulation. – Sergey Kalinichenko Apr 16 '18 at 16:01
  • @dbush, I just checked, wow you are right. Learn something new every day. – dubiousjim Apr 16 '18 at 16:08
  • @PeterJ_01, no I'm not coding with a funny architecture or compiler; but I want to know what the standard requires to keep track of if/when I'm not complying with it, or when I'm trusting that the next version of my compiler won't decide this bit of undefined behavior gives them an opportunity for some surprising optimization. I realize that if the usual behavior of these operations changed, thousands of programmers would scream. – dubiousjim Apr 16 '18 at 16:08
  • @dubiousjim nothing has changed - they just wrote down what was obvious to any sane programmer. Bitwise operations are safe on unsigned integers. I do not see any changes which can affect properly written programs. Of course there are many discussions here where hairsplitters were proving something else. Of course one can bit-manipulate the signed integer types - if he knows what the dangers are and what he wants to achieve. – 0___________ Apr 16 '18 at 16:20
  • Between the two questions I linked to, [46387704](https://stackoverflow.com/questions/46387704) (which @4386427 also linked to, and which this question is now proposed as a dupe of) and [11644362](https://stackoverflow.com/questions/11644362), I guess there's enough discussion of the issue. Feel free to close this one, but it should link to both of those discussions, not just the first; I got at least as much out of the second. – dubiousjim Apr 16 '18 at 22:44
  • @dubiousjim: I am unaware of any evidence that any production compiler conforming to the C99 or C11 standards has ever used anything other than two's-complement signed types, nor that any future compiler would be likely to do so. – supercat May 29 '18 at 20:18

0 Answers