3

According to the standard, operator << yields an Undefined Behavior for a negative signed first operand.

C++11 5.8.2

The value of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are zero-
filled. If E1 has an unsigned type, the value of the result is E1 × 2 pow E2,
reduced modulo one more than the maximum value representable in the result type.
Otherwise, if E1 has a signed type and non-negative value, and E1 × 2 pow E2 is
representable in the result type, then that is the resulting value; otherwise,
the behavior is undefined

This is understandable as the layout of integers in memory is implementation defined.

C++11 3.9.1.7

this International Standard permits 2’s complement, 1’s complement and
signed magnitude representations for integral types.

On the other hand, the standard do not seem to define exactly what bitwise & | and ^ should do.

C++11 5.11 Bitwise AND operator

and-expression:
    equality-expression
    and-expression & equality-expression
1 The usual arithmetic conversions are performed; the result is the bitwise 
AND function of the operands. The operator applies only to integral
or unscoped enumeration operands.

C++11 5.12 Bitwise exclusive OR operator

exclusive-or-expression:
    and-expression
    exclusive-or-expression ˆ and-expression
1 The usual arithmetic conversions are performed; the result is the bitwise
exclusive OR function of the operands. The operator applies only to integral
or unscoped enumeration operands.

C++11 5.13 Bitwise inclusive OR operator

inclusive-or-expression:
    exclusive-or-expression
    inclusive-or-expression | exclusive-or-expression
1 The usual arithmetic conversions are performed; the result is the bitwise
inclusive OR function of its operands. The operator applies only to integral
or unscoped enumeration operands.

The definition of those operators completely eludes me. Is it somewhere else in the standard ? Is the result implementation defined for signed integer ?

As an example let's look at this code:

signed char a=-1;
signed char b=3;
signed char c=a&b;

With 2's complement, a is 1111 1111, and b is 0000 0011. Finally c equals 0000 0011 (+3).

With 1's complement, a is 1111 1110, and b is 0000 0011. Does c equals 0000 0010 (+2) ?

With sign-magnitude, a is 1000 0001, and b is 0000 0011. Does c equals 0000 0001 (+1) ?

If you have access to platforms using 1's complement or sign-magnitude, what is the result on those platforms ?

Arnaud
  • 3,765
  • 3
  • 39
  • 69
  • 6
    I'm sure bitwise boolean operators only care about bitwise operations and don't concern themselves with signedness. – Tony The Lion Jan 10 '13 at 14:34
  • "... standard do not seem to define exactly what bitwise & | and ^ should do" - yes it does and you quoted that part. What's missing from the definition? – Mat Jan 10 '13 at 14:37
  • @ Tony: Then why do we have an UB in operator << ? The bits of the signed integer could be move as if the operand were an unsigned integer. – Arnaud Jan 10 '13 at 14:42
  • @ Mat: Defining bitwise and operator with "the result is the bitwise AND function of the operands" does not seem like a valid definition to me. – Arnaud Jan 10 '13 at 14:46
  • @Arnaud: Are you saying it should explicitly define the terms "bitwise" and "AND function", both of which are common knowledge? Or is there something else that you think is missing from the definition? – Mike Seymour Jan 10 '13 at 15:17
  • @ Mike Seymour: My problem is that the result is implementation defined and this is what I feel is missing from the standard. The consequence is that those operators should be avoided with signed types when writing portable code. – Arnaud Jan 10 '13 at 15:40
  • Regarding bitwise operations, I think the opposite view makes more sense: The interpretation of a bit pattern as an integer is implementation-dependent, while a bitwise operation on bit patterns is well-defined by the standard. – molbdnilo Jan 10 '13 at 16:35
  • So what non-ancient machines still use 1's complement or signed-magnitude? Last time I saw 1's complement was on a CDC 6600 circa 1980. Dealing with -0 was fun. – brian beuning Jan 10 '13 at 18:24
  • @brianbeuning: http://stackoverflow.com/a/12277974/204847 – Mike Seymour Jan 11 '13 at 03:51

4 Answers4

5

The bitwise operations operate on each bit independently, whatever each bit might happen to mean when interpreted as part of a numeric type.

So yes, 10000001 & 00000011 == 00000001, regardless of whether each bit represents a sign or part of a value.

Mike Seymour
  • 249,747
  • 28
  • 448
  • 644
  • What if the operands are signed and the architecture admits trap representations? Can you get UB? – Kevin Feb 02 '16 at 06:17
0

The bitwise & | ^ operators just do their named operation of every bit on each of the two operands, whihc will be implementation specific depending on the representation of the underlying type.

However, things are different for shifting.

For example consider a one byte twos-complement -1 = 11111111. Then you shift right one. Now is your number 127 (changing sign) or -1 (shifting 1 into the highest bit rather than 0). The same thing applies if it's a sign-magnitude representation. To avoid all of those sorts of problems the standard simply prohibits it.

Mark B
  • 95,107
  • 10
  • 109
  • 188
  • Right shifting is implementation defined and it is clearly stated in 5.8.3. But there is no mention of that with bitwise operators. – Arnaud Jan 10 '13 at 15:00
0

Left-shift and right-shift on negative signed integers are treated by the standard the way they are in order to allow implementations to use "arithmetic shift" machine instructions. An arithmetic right shift replicates the sign bit, unlike a logical right shift which inserts 0s on the left. An arithmetic left shift may produce an overflow exception on some architectures if a bit shifted off of the left-hand edge differs from the sign bit. Consequently, right-shift is implementation-defined (because the result is always valid but may differ depending on implementation) while left-shift is undefined (because the result might be an interrupt.)

The bit patterns produced by bitwise logical operators are fully specified, but in the case of signed integers, it is possible that the result is a trap value (such as -0 on a 1's-complement or sign-magnitude architecture where -0 is not valid). In this case, result is undefined behaviour, according to paragraph 4 of the introduction to section 5:

If during the evaluation of an expression, the result is not mathematically defined or not in the range of representable values for its type, the behavior is undefined.

rici
  • 234,347
  • 28
  • 237
  • 341
  • On some machines, even with two's complement arithmetic, -32768 may be a trap value. Is there any guarantee that (-1 ^ 32767) must be well-defined? – supercat Jan 12 '13 at 04:04
  • @supercat: `(-1 ^ 32767)` is not well-defined because the bit-representation of `-1` is implementation-defined. As a more direct answer to your question, it is possible that a bitwise operation could produce a trap value; in that case, according to paragraph 4 of the introduction to section 5: "If during the evaluation of an expression, the result is... not in the range of representable values for its type, the behavior is undefined." – rici Jan 12 '13 at 18:58
  • To my mind, there should be some directives which would specify that a compiler must either choose a particular implementation for things like signed-vs-unsigned char, signed-integer format, etc. or yield a compilation error if the compiler couldn't do so. A block of code which started with a directive mandating two's-complement rules for the bitwise operators would then be portable to any machine which *either* used two's-complement format natively, or to any machine whose compiler could generate code that would compute the values that would be computed using two's-complement values. – supercat Jan 12 '13 at 20:09
  • Note that because `char`, `signed char`, and `unsigned char` are three distinct types, compilers in which `char` is signed are effectively compiling a different language from those in which it is unsigned; having a means by which code could specify which "language" it should be compiled with would help to prevent bugs that could result if code was compiled for a language other than the one for which it was written. – supercat Jan 12 '13 at 20:11
  • @supercat: "Optionally signed" chars were not one of the best ideas in C and the language probably would have ended up better without them. But that bus left forty years ago and it ain't gonna stop here again. – rici Jan 13 '13 at 00:19
  • In environments where sizeof(int)==1, `char` must be signed. In environments where some characters in the source character set have representations which exceed the maximum signed char value, `char` must be unsigned. My point was that if code is going to only work correctly if certain expectations about the compiler hold, and if it would be possible for a compiler to conform itself to different expectations, having a standard means of requesting that the compiler conform itself to fit such expectations would be useful. – supercat Jan 13 '13 at 00:23
-1

Results do not depend on how a machine represents integers.

0 & 0 = 0
0 & 1 = 0
1 & 0 = 0
1 & 1 = 1
0 | 0 = 0
0 | 1 = 1
1 | 0 = 1
1 | 1 = 1
0 ^ 0 = 0
0 ^ 1 = 1
1 ^ 0 = 1
1 ^ 1 = 0

The above table shows the value for one bit. Given a 32-bit integer, the bit computations are done for all 32-bits. Hence the term bit-wise.

brian beuning
  • 2,836
  • 18
  • 22