39

C11 §6.5.7 Paragraph 5:

The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.

But the viva64 reference document says:

int B;
B = -1 >> 5; // unspecified behavior

I ran this code on GCC and it always outputs -1.

So the standard says that "If E1 has a signed type and a negative value, the resulting value is implementation-defined", but that document says that -1 >> 5 is unspecified behavior.

So, is -1 >> 5 unspecified behavior in C? Which one is correct?

dbush
msc

4 Answers

39

Both are correct. Implementation defined behavior is a particular type of unspecified behavior.

Citing section 3.4.1 of the C standard which defines "implementation-defined behavior":

1 implementation-defined behavior

unspecified behavior where each implementation documents how the choice is made

2 EXAMPLE An example of implementation-defined behavior is the propagation of the high-order bit when a signed integer is shifted right.

From section 3.4.4 defining "unspecified behavior":

1 unspecified behavior

use of an unspecified value, or other behavior where this International Standard provides two or more possibilities and imposes no further requirements on which is chosen in any instance

2 EXAMPLE An example of unspecified behavior is the order in which the arguments to a function are evaluated.

As for GCC, you'll always get the same answer because the operation is implementation-defined, and GCC documents its choice: it implements right shift of negative numbers via sign extension.

From the GCC documentation:

The results of some bitwise operations on signed integers (C90 6.3, C99 and C11 6.5).

Bitwise operators act on the representation of the value including both the sign and value bits, where the sign bit is considered immediately above the highest-value value bit. Signed >> acts on negative numbers by sign extension.

As an extension to the C language, GCC does not use the latitude given in C99 and C11 only to treat certain aspects of signed << as undefined. However, -fsanitize=shift (and -fsanitize=undefined) will diagnose such cases. They are also diagnosed where constant expressions are required.
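To see what a given compiler does, a minimal sketch along these lines may help. The helper names `shift_right` and `show_shift` are made up for illustration, and the shift count is assumed to be nonnegative and smaller than the width of `int`.

```c
#include <stdio.h>

/* Right-shift a signed int. For negative x the result is
   implementation-defined (C11 6.5.7p5), so the value returned here
   is whatever the compiler in use documents.
   Assumes 0 <= n < width of int. */
int shift_right(int x, int n)
{
    return x >> n;
}

/* Hypothetical helper: print what this implementation produces. */
void show_shift(int x, int n)
{
    printf("%d >> %d = %d\n", x, n, shift_right(x, n));
}
```

With GCC, which documents sign extension, `shift_right(-1, 5)` yields -1 and `shift_right(-32, 2)` yields -8. Another conforming compiler may document a different result.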

dbush
  • 1
    This is not quite correct: "Implementation defined behavior is a particular type of unspecified behavior". If a behavior is implementation-defined, the standard is specifying that the implementation must define and document it, so it's not unspecified. Unspecified behavior is for situations where the implementation is free to choose but does not have to document a behavior or make it consistent. – R.. GitHub STOP HELPING ICE Oct 16 '17 at 22:24
  • 10
    @R.. Except the definition of "implementation-defined behavior" actually uses the exact words "unspecified behavior". I guess requiring documentation is not considered imposing "further requirements on which is chosen". – aschepler Oct 16 '17 at 23:19
  • 1
    I would interpret the definition of "implementation-defined behaviour" that you quoted as being exclusive of "unspecified behaviour", i.e. meaning "as for unspecified behaviour, but also with this requirement". After all, the definition of "unspecified behaviour" includes "imposes no further requirements on which is chosen", however implementation-defined behaviour does impose a requirement that the implementation document the choice, therefore it does not meet the definition of "unspecified behaviour" – M.M Oct 19 '17 at 04:15
  • @M.M implementation-specified behaviour says: "documents on how the choice is made" - so an implementation can document it as "at random" and conform to the standard, and it is indistinguishable from any other undefined behaviour. – Antti Haapala -- Слава Україні Aug 27 '18 at 02:50
  • I provide [further evidence, a DR to C89](http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_154.html) :) – Antti Haapala -- Слава Україні Aug 27 '18 at 04:44
  • @AnttiHaapala: I hadn't seen that one. However, the C89 Rationale does say "While a deficient implementation could probably contrive a program that meets this requirement, yet still succeed in being useless, the C89 Committee felt that such ingenuity would probably require more work than making something useful." Thus, most questions about whether the Standard allowed a conforming implementation to behave in some obtuse fashion should have been answered "The Standard would probably allow a garbage-quality-but-conforming implementation to behave that way. So?" Too bad they weren't. – supercat Aug 27 '18 at 15:27
  • @AnttiHaapala: That's not to suggest that all such questions are about behaviors that would generally be obtuse. Most of them concern behaviors that would be sensible in some cases but obtuse in others. Unfortunately, the authors of the Standard, in replying to such questions, fail to make clear that the Standard often gives permission to behave a certain way in cases where it is obtuse as well as those where it is sensible; they expect that people writing seeking to write quality implementations for a particular purpose will be able to judge which cases are which. – supercat Aug 27 '18 at 16:19
  • Since evidently this thing is shambling back to life, or at least undeath, I'm inclined to agree with aschepler. The natural reading of "unspecified behavior where [...]" in the definition of implementation-defined behavior is that implementation-defined behavior is a sub-category of unspecified behavior. For example, if I say "a sack race is a race where every participant has their legs and lower torso in a large sack", I mean a sack race is a specific kind of race, not that a sack race isn't actually a race, but only *like* one. – John Bollinger Aug 27 '18 at 18:50
14

"Unspecified behavior" and "implementation defined" are not contradictory. It just means that the C standard does not specify what needs to happen, and that various implementations can do what they deem "correct."

Running it multiple times on one compiler and getting the same result only means that that particular compiler is consistent. You may get different results on a different compiler.
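If the same result is needed on every compiler, one option is to avoid shifting negative values altogether. The sketch below is only an illustration, not something from the standard or from GCC: `asr` and `lsr` are hypothetical names, and the code assumes a two's complement representation and a shift count smaller than the width of `int`.

```c
/* Arithmetic (sign-extending) right shift that never applies >> to a
   negative value. Assumes two's complement and n < width of int. */
int asr(int x, unsigned n)
{
    if (x >= 0)
        return x >> n;      /* fully defined by the standard */
    /* ~x is nonnegative when x is negative, so this shift is well
       defined; complementing again restores the leading one bits. */
    return ~(~x >> n);
}

/* Logical (zero-filling) right shift: convert to unsigned first, so
   the shift operates on an unsigned value. */
unsigned lsr(int x, unsigned n)
{
    return (unsigned)x >> n;
}
```

Here `asr(-32, 2)` gives -8 regardless of how the compiler treats `-32 >> 2`, while `lsr(-1, 1)` gives the largest `unsigned` value divided by two.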

skrrgwasme
  • 1
    Neither term is a subset of the other. If an action invokes "unspecified" behavior, implementations are required to choose from among a finite set of choices (e.g. `x()+y()` must behave as though it either fully evaluates `x()` and then evaluates `y()`, or fully `y()` and then `x()`; those are the only two choices). If an action invokes "Implementation-Defined" behavior, implementations are required to document a specific behavior, but could do just about anything they like so long as they document it. – supercat Oct 16 '17 at 21:04
  • @supercat it says "document *how the choice is made*, not document *which* behaviour was chosen". – Antti Haapala -- Слава Україні Aug 27 '18 at 03:40
  • @AnttiHaapala: I don't think "Implementation-Defined" is intended as an invitation for implementations to simply say "This implementation selects among these behaviors unpredictably". The Standard poses no hard requirements upon the quality of implementations' documentation, but I think the clear implication would be that only low-quality implementations would fail to say something useful. – supercat Aug 27 '18 at 13:48
  • @supercat, I'm not seeing what in the definition of "unspecified behavior" requires the behavior to be chosen from a *finite* set of choices, unless it's that computers can represent only a finite number of distinct states. "Two or more" does not convey finiteness to me. – John Bollinger Aug 27 '18 at 17:55
  • @JohnBollinger: The places in the Standard where I've noticed the term used describe things that would have a finite number of choices, such as the order in which various operations are performed, or the values left in padding bytes of a structure when preceding members are written. It might have been possible and appropriate for Standard to use the term in cases where there might be an infinite number of possible behaviors that were all quite similar (e.g. saying that if an implementation pre-defines a certain macro, the behavior of integer overflow would be limited to yielding a value... – supercat Aug 27 '18 at 18:32
  • ...that behaves like any mathematical integer which is congruent to the correct value, mod the range of the integer type. Since there would be a countably infinite number of such values, there would thus be a countably infinite number of possible behaviors). I don't know of anyplace the Standard actually refers to "unspecified" behaviors, however, where it does not also specify a finite number of possibilities. Is there anyplace it does so that I haven't noticed? – supercat Aug 27 '18 at 18:35
2

Implementation-defined behaviour is a subclass of unspecified behaviour, i.e. behaviour that is not specified by the standard.

Defect Report #154 to C89 asked the committee what the limits of implementation-defined behaviour are; the committee answered that an implementation can define any behaviour it wants, and that the behaviour need not be constant.

What an implementation needs to do is document how the choice is made. This contrasts with the other class of unspecified behaviour, where a conforming implementation need not even say how the choice is made, possibly because for the majority of implementations the text would say "at random", "depending on the compiler optimization level", or "depending on the register allocation for local variables".

  • The Standard avoids using the term "implementation-defined behavior" in cases where any significant number of quality implementations would have been expected to say "at random". The threshold between UB and IDB seems to generally be whether the authors of the Standard envisioned any plausible situations where some implementation might be unable to specify anything useful. Consider the behavior of `-1< – supercat Aug 27 '18 at 16:28
  • ...of UB could just as easily be IDB, but the possibility that there might exist some implementation where it might be hard to document a consistent behavior for `-1< – supercat Aug 27 '18 at 16:30
2

I don't get any of the present answers. The C standard clearly says that right-shifting a negative number is implementation-defined behavior. It is not unspecified behavior, which means something else. As you correctly cite (C17 6.5.7 §5):

The result of E1 >> E2 is E1 right-shifted E2 bit positions. /--/
If E1 has a signed type and a negative value, the resulting value is implementation-defined.

This means that the compiler must document how it behaves. Period.

In practice: the documentation must state whether the compiler uses an arithmetic right shift or a logical right shift.
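The documentation is the authoritative source, but the choice can also be probed from code. A rough sketch, where `right_shift_is_arithmetic` is a hypothetical name and the two outcomes are assumed to be the only ones an implementation would pick:

```c
/* Returns 1 if >> on a negative int keeps the sign (arithmetic shift),
   0 if it fills the vacated bit with zero (logical shift).
   The result reflects only the compiler and target in use. */
int right_shift_is_arithmetic(void)
{
    return (-1 >> 1) < 0;
}
```

On GCC this returns 1, matching its documented sign extension.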


This is as opposed to unspecified behavior, which is implementation-specific behavior that does not need to be documented. Unspecified behavior is used in two cases:

  • When the compiler behavior might be an implementation secret that the compiler vendor should not be forced to reveal to their competitors.
  • When the compiler can't be bothered to document how the underlying details such as OS and RAM memory cells work.

For example, a compiler does not need to document the order of evaluation in code like this:

a  = f1() + f2();
a += f1() + f2();

Documenting the order in which the sub-expressions are evaluated would reveal details about how the compiler's internal expression tree and optimizer work, which in turn would reveal why a compiler produces better code or compiles faster than the competition. This was a big thing when the C standard was originally written. Less so nowadays, when there are some great open-source compilers, so it is no longer a secret.
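To observe the order a particular compiler happens to pick, the two functions can record their calls. This is a sketch only: `call_log`, `call_count` and `sum_and_record` are invented for the illustration, and `f1` and `f2` are given trivial bodies that the original snippet does not show.

```c
/* Hypothetical log of the order in which f1 and f2 were called. */
int call_log[2];
int call_count;

int f1(void) { call_log[call_count++] = 1; return 1; }
int f2(void) { call_log[call_count++] = 2; return 2; }

/* Evaluates f1() + f2() and leaves the observed call order in
   call_log. The sum is always 3; the order of the two log entries
   is unspecified and may differ between compilers or settings. */
int sum_and_record(void)
{
    call_count = 0;
    return f1() + f2();
}
```

After a call to `sum_and_record()`, `call_log` holds either 1 then 2, or 2 then 1. Both are conforming, and the compiler does not have to say which one it produces.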

Similarly, a compiler does not need to document what this code prints:

int a;
int* ptr = &a;
printf("%d", *ptr);

a holds an indeterminate value, so the output is unspecified; in practice it depends on whatever was stored in that particular RAM cell beforehand, which we would call a "garbage value". (Before yelling "UB", see (Why) is using an uninitialized variable undefined behavior?).

Lundin