
In C++, some behaviour falls somewhere between well-defined and undefined. The Standard calls these categories implementation-defined and unspecified. Right now, I'm interested in unspecified behaviour.

When is it okay to use such features, and when should they be avoided? Are there good examples of unspecified behaviour being a part of correct code? When, if ever, is it the best choice to make when writing software?

Definitions provided by Matt McNabb:

  • Undefined - anything at all can happen

  • Implementation-defined - a finite number of results are possible, and the compiler's documentation must say what happens

  • Unspecified - a finite number of results are possible -- usually the Standard describes the set of possible results

  • Well-defined - none of the above

  • Well-formed program - program that compiles without error (may exhibit undefined behaviour)
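To make the categories above concrete, here is a small sketch. The function names are made up for illustration, and the undefined case is left as a comment so that the code itself stays valid.

```cpp
#include <climits>
#include <cstddef>

// Hypothetical helper: increments the counter and returns the new value.
int next_id(int& counter) { return ++counter; }

// Returns its first argument and ignores the second.
int first_of(int a, int /*unused*/) { return a; }

// Implementation-defined: the width of long varies between platforms,
// but every implementation must document it (and it is at least 32 bits).
std::size_t long_width_bits() { return sizeof(long) * CHAR_BIT; }

// Unspecified: the two arguments may be evaluated in either order,
// so this returns 1 or 2, and the compiler need not document which.
int unspecified_result() {
    int counter = 0;
    return first_of(next_id(counter), next_id(counter));
}

// Well-defined: separate statements are sequenced, so this always returns 1.
int well_defined_result() {
    int counter = 0;
    int first = next_id(counter);
    next_id(counter);
    return first;
}

// Undefined (deliberately not compiled): reading an uninitialized int.
// int undefined_result() { int x; return x; }
```

Code that only needs "some valid id" can live with `unspecified_result()`; code that needs a particular value has to be written like `well_defined_result()`.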

Follow-up question:

Do relaxed atomics count as unspecified or well-defined?

This was marked as a duplicate of a question that approaches the same idea from a different perspective. That question is about the definition of unspecified behaviour, whereas this one is about how and when to use it.

AstroCB
Michael Gazonda
  • Do you mean "implementation-defined"? If not, I can't make sense of the question. – Jon Jul 17 '14 at 20:46
  • It seems like there is only well-defined and "other". – user2864740 Jul 17 '14 at 20:46
  • You mean like `int arr[5]{}; someCondition ? arr[0] : arr[10];`? Sorry, it's the only one I can think of off the top of my head and it's not very useful for anything other than categorization. – chris Jul 17 '14 at 20:46
  • You have to give a concrete example; otherwise this question is too broad. – Ahmed Masud Jul 17 '14 at 20:48
  • Not everything that isn't "well-defined" is void of definition, and as such may have usefulness. Examples are left out on purpose, as I expect they would have a large influence on the answers. – Michael Gazonda Jul 17 '14 at 20:48
  • @MGaz: Your example is not "without any definition", it is standard undefined behavior. – Jon Jul 17 '14 at 20:55
  • @Jon - There are limits to what can happen or not happen. As such, it is somewhere between defined and undefined. – Michael Gazonda Jul 17 '14 at 20:57
  • @MGaz: No, there are no limits. The definition of undefined behavior is that anything at all can happen. – Jon Jul 17 '14 at 20:58
  • @Jon [see here for some discussion](http://stackoverflow.com/questions/20922609/why-does-optimisation-kill-this-function/20956250#20956250) it is not clear if this is undefined in C++ but it is well defined in C. In practice type punning via a union is well supported. – Shafik Yaghmour Jul 17 '14 at 20:58
  • This is not undefined behavior because it won't even compile. – Brandin Jul 17 '14 at 20:58
  • @Brandin - fixed, was missing semicolon after union declaration – Michael Gazonda Jul 17 '14 at 21:00
  • I still have no idea what is "well-defined" or "undefined" in this context. There is no such thing. There is "implementation-defined" as @Jon pointed out. That is really what I think you're talking about here. Relying on implementation-defined behaviour would make sense if you're targeting a specific implementation. – Brandin Jul 17 '14 at 21:02
  • @Jon - You're right that anything can happen in undefined behaviour. I disagree that this fits that category. It's somewhere in-between. If you can prove otherwise, I'm all ears. – Michael Gazonda Jul 17 '14 at 21:02
  • The language does not specify what the compiler should do here, so it's not "defined". Yet, it *will* produce consistent results -- irrespective of the endianness of your computer! That is, only within well-defined border requirements, e.g., "this code is for Big-Endian CPUs only". – Jongware Jul 17 '14 at 21:03
  • @MGaz - Nope, it still doesn't compile. Maybe you mean `union { ... } a = {23};` and then you try to type pun access `a.b` – Brandin Jul 17 '14 at 21:04
  • @MGaz: Open a copy of the standard and read 3.10/10. It lists a series of cases that are legal; everything else is undefined behavior by definition. Including (to me at least) the code above, because it doesn't fit any of the legal cases. – Jon Jul 17 '14 at 21:07
  • Added a big update, hope you'll re-open – Michael Gazonda Jul 17 '14 at 21:09
  • Go to the top of the comments thread and read that comment. Then look at your question again. Which of your examples are implementation defined? – Brandin Jul 17 '14 at 21:13
  • @MGaz: The rule is a simple tautology, even though it is also in the standard. Paraphrased: Where the standard does not impose any requirements, it does not impose any requirements. – Deduplicator Jul 17 '14 at 21:13
  • @Jon: Aliasing to `char` arrays is a known exception to that rule, IIRC. – Lightness Races in Orbit Jul 17 '14 at 21:14
  • @Brandin - implementation defined can be an answer to my question, but not the _only_ answer to the question. – Michael Gazonda Jul 17 '14 at 21:16
  • @Deduplicator - There are places where there _are_ requirements, though those requirements aren't enough to qualify for "well-defined". – Michael Gazonda Jul 17 '14 at 21:17
  • Consider this program `int main() { int x; x *= 0; return x; }`. The behaviour of this program is clearly undefined, but I would be surprised to find any implementation that did not produce an executable that always returns 0. – Brandin Jul 17 '14 at 21:18
  • @brandin you've obviously not met a modern optimising compiler. – Flexo Jul 17 '14 at 21:23
  • @Flexo - anything * 0 = 0, how could that ever result in anything other than 0 being returned? x starts as "undefined", but an undefined int * 0 must be 0. There is no case where the answer could be something else. Undefined value != undefined behaviour. – Michael Gazonda Jul 17 '14 at 21:25
  • @Brandin: Many implementations will optimize away the multiplication, returning whatever garbage was in `x` before the function. That can result in a crash (see Itanium architecture, NaT) – Deduplicator Jul 17 '14 at 21:26
  • @Flexo Yes I suppose the optimizer technically could "take advantage" of the undefined behaviour and freely optimize out the function body. I would still be surprised to find the compiler that performs such an optimization in this case. – Brandin Jul 17 '14 at 21:27
  • Would optimizing out the multiplication there be a compiler bug? – Michael Gazonda Jul 17 '14 at 21:28
  • @MGaz No that is not correct. Accessing an uninitialized variable is undefined behaviour. It means the implementation can do anything it wants. Maybe I want to build my compiler to throw an exception if you do that. Just an example. – Brandin Jul 17 '14 at 21:28
  • @Brandin that's true if you're thinking like a mathematician, but that's not the rules compilers live by. A modern compiler can trivially prove that use of a variable is always undefined. By the rule of the standards that can and does explicitly permit any result. Not just the results that some random person on the internet postulated would make sense, but anything. So the compiler can and does legally propagate that undefinedness to do what it considers "optimal". See: http://css.csail.mit.edu/stack/ for a slightly twisted view of the implications of that. – Flexo Jul 17 '14 at 21:29
  • Read the linked SO question, that is along similar lines to what you brought up. Btw no one can give a proper answer here because the answer is unspecifiable as it is now. What's more, my brain produces undefined behaviour if it tries to read or write an unspecifiable answer to a question, especially with not enough sleep. Good night ladies/gentleman. I suggest moving on to some real questions or to bed. – Brandin Jul 17 '14 at 21:51
  • @MGaz: http://stackoverflow.com/questions/11373203/accessing-inactive-union-member-undefined Writing to one member of a union then reading from another is disallowed by section 9.5 of the C++ standard. It's without any definition because it is not allowed. – Mooing Duck Jul 17 '14 at 21:54
  • @MooingDuck - read the rest of the answer there. – Michael Gazonda Jul 17 '14 at 21:55
  • @MGaz: Answer #1:"access without one of the above [tricks/quirks] is undefined behaviour." Answer #2:"If only one value is stored, how can you read another? It just isn't there." Answer #3:"these both carry a strong implication that "inspecting" (reading) a member is "permitted" only if 1) it is (part of) the member most recently written, or 2) is part of a common initial sequence." (This gives you some credence) Answer #4:" In this case it might be undefined behavior by lack of specification." Answer #5:"I can find nowhere in the Standard that explicitly forbids it"(this supports you too) – Mooing Duck Jul 17 '14 at 22:00
  • @MGaz As others already mentioned, there's nothing ***Somewhere in between***. Dereferencing a `nullptr` is undefined behavior, period. – πάντα ῥεῖ Jul 17 '14 at 22:01
  • @πάντα ῥεῖ - **I specifically said dereferencing a nullptr was undefined.** Not sure why you're using that as an argument against my question. – Michael Gazonda Jul 17 '14 at 22:03
  • @πάνταῥεῖ Mine is better, though strictly speaking the answer is: Avoid them whenever possible, and be sure your implementation guarantees to do what you want there, or that you are fine with how any implementation could do it. Liberally insert static assertions into your code to test those assumptions. – Deduplicator Jul 17 '14 at 22:07
  • note: more detail about the different behaviour types, http://stackoverflow.com/questions/2397984/undefined-unspecified-and-implementation-defined-behavior – M.M Jul 17 '14 at 22:45
  • Still not a duplicate. This is about when to use unspecified behaviour, not what it is. – Michael Gazonda Jul 17 '14 at 23:28
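
Deduplicator's suggestion of static assertions might look something like the sketch below. The particular assumptions checked (8-bit bytes, arithmetic right shift of negative values) and the function name `floor_half` are illustrative and do not come from the thread.

```cpp
#include <climits>

// Assumption (illustrative): bytes are 8 bits wide.
static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");

// Before C++20, right-shifting a negative value is implementation-defined.
// Refuse to compile unless it behaves as an arithmetic shift.
static_assert((-3 >> 1) == -2, "this code assumes arithmetic right shift");

// Hypothetical function that relies on the assumption checked above:
// halves v, rounding toward negative infinity.
int floor_half(int v) { return v >> 1; }
```

If an implementation makes a different choice, the build fails with a readable message instead of the program silently computing something else.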

1 Answer


To answer the new question, "When is it OK to use unspecified behaviour?"

It may sound slightly facetious, but "any time it doesn't matter to you which option happens".

For example,

```cpp
int foo() { cout << "foo"; return 1; }
int bar() { cout << "bar"; return 2; }
// ...
cout << (foo() + bar()) << "\n";
```

If you don't care whether you see "foobar3" or "barfoo3" then you could write this code. If it does matter then you would have to change it, e.g.

```cpp
int i = foo(); i += bar(); cout << i << "\n";
```

The order is left unspecified so that the compiler is free to choose whichever order is optimal in the general case.
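
A self-contained sketch of the same idea follows. It assumes hypothetical variants of `foo` and `bar` that append to a string log instead of printing, so that the call order can be inspected afterwards; the names `sum_any_order` and `sum_fixed_order` are also made up for illustration.

```cpp
#include <string>
#include <utility>

// Hypothetical helpers: each appends its name to the log and returns a value.
int foo(std::string& log) { log += "foo"; return 1; }
int bar(std::string& log) { log += "bar"; return 2; }

// The operands of + may be evaluated in either order, so the log
// ends up as "foobar" or "barfoo". The sum is 3 either way.
std::pair<std::string, int> sum_any_order() {
    std::string log;
    int total = foo(log) + bar(log);
    return {log, total};
}

// Separate statements are sequenced one after the other,
// so the log is always "foobar".
std::pair<std::string, int> sum_fixed_order() {
    std::string log;
    int total = foo(log);
    total += bar(log);
    return {log, total};
}
```

If only the sum matters, `sum_any_order()` is fine. If the order of the side effects matters, the sequencing has to be spelled out as in `sum_fixed_order()`.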

M.M