11

I'm aware that undefined behavior can potentially cause anything, which makes any program containing UB potentially meaningless. I was wondering if there is any way to identify the earliest point in a program that undefined behavior could cause problems. Here is an example to illustrate my question.

void causeUndefinedBehavior()
{
   //any code that causes undefined behavior
   //every time it is run
   char* a = nullptr;
   *a;
}


int main()
{
 //code before call
 //...
 causeUndefinedBehavior();
 //code after call
 //...
}

From my understanding, the possible times undefined behavior could be invoked (not necessarily manifested) are:

  1. When causeUndefinedBehavior() is compiled.
  2. When main() is compiled.
  3. At the time the program is run.
  4. At the time causeUndefinedBehavior() is executed.

Or is the point where undefined behavior is invoked completely different for every case and every implementation?

In addition, if I commented out the line where causeUndefinedBehavior() is called, would that eliminate the UB, or would it still be in the program since code containing UB was compiled?

Elliot Hatch

5 Answers

4

As your code somewhat demonstrates, undefined behavior is almost always a condition of runtime state at the time the behavior is attempted. A slight modification of your code can make this painfully obvious:

#include <cstdlib>  // srand, rand
#include <ctime>    // time

void causeUndefinedBehavior()
{
   //any code that causes undefined behavior
   //every time it is run
   char* a = nullptr;
   *a;
}


int main()
{
 srand(time(NULL));
 //code before call
 //...
 if (rand() % 973 == 0)
    causeUndefinedBehavior();
 //code after call
 //...
}

You could execute this a thousand times or more and never trip the condition that triggers the UB. That doesn't change the fact that the function itself clearly has undefined behavior, but detecting that at compile time in the context of the caller is not trivial.
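
One caveat to the run-time view: an optimizer is allowed to assume that undefined behavior never happens, and that assumption can influence code away from the offending line. Below is a minimal sketch of the commonly cited pattern. The name `derefThenCheck` is hypothetical, and whether a given compiler actually drops the check depends on the compiler and the optimization level.

```cpp
#include <cassert>

// Hypothetical helper, not part of the code above.
// The dereference comes first, so a compiler may reason that p cannot be
// null here (a null p would already be UB) and may remove the later check.
int derefThenCheck(const int* p)
{
    int value = *p;          // UB if p is null
    if (p == nullptr)        // may be treated as unreachable and removed
        return -1;
    return value;
}

// Well-defined use: the pointer is valid, so the result is just the value.
void checkDerefThenCheck()
{
    int x = 7;
    assert(derefThenCheck(&x) == 7);
}
```

Calling it with a valid pointer is always fine; only an execution that passes a null pointer has undefined behavior, and on that execution the `-1` branch cannot be relied on.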

WhozCraig
  • 2
    Hmm, but isn't it theoretically possible for the compiler to decide that since `causeUndefinedBehavior()` causes undefined behaviour, it must never be called in a correct program, and so decide that it is safe to discard the `if`? – Mankarse Oct 29 '12 at 00:37
  • 2
    @Mankarse a very good point. Now move what is in main() into causeUndefinedBehavior() (call it mayCauseUndefinedBehavior()), i.e. make the function itself only conditionally exhibit UB as well. The point is that without *running* it you can't *know* (a full throw-out interpretation notwithstanding, which is 'running' as far as I'm concerned). – WhozCraig Oct 29 '12 at 00:44
3

"Undefined behavior" means that the language definition doesn't tell you what your program will do. That's a very simple statement: no information. You can speculate all you like about what your implementation may or may not do, but unless your implementation documents what it does, you're only guessing. Programming isn't about guessing; it's about knowing. If the behavior of your program is undefined, fix it.

Pete Becker
  • 2
    This is all true, but doesn't answer the OP's question. – Jonathon Reinhart Oct 30 '12 at 04:51
  • @JonathonReinhart - right. It doesn't answer the question because the question has no answer. – Pete Becker Oct 30 '12 at 12:29
  • The simplest answer is "at runtime of the undefined code in question." – Jonathon Reinhart Oct 30 '12 at 13:13
  • 2
    @JonathonReinhart - that's simple, but it's not correct. The language definition says that the behavior of a **program** is undefined if it does various things. It does not say that you can run the program up to the point where you've done something wrong. It doesn't require, even, that the program compile and produce an executable. Yes, there are arguments akin to how many angels can dance on the head of a pin, but undefined behavior means that the behavior of a program is undefined. Full stop. – Pete Becker Oct 30 '12 at 14:18
  • 1
    A program may exhibit undefined behavior on some or all inputs. If it is undefined only for some inputs, the behavior is still defined on the remaining inputs. – Demi Jul 06 '13 at 05:59
  • 1
    @Demetri: saying something doesn't make it true. Before you tell the world Pete Becker knows less about C++ than you do, find some actual supporting evidence. – Tony Delroy Jun 12 '14 at 16:08
  • @TonyD example: `int main(int argc, char** argv) { if (argc == 1) { int* x = 0; *x = 0; /* undefined behavior */ } else { return 0; /* well-defined */ } }` has undefined behavior if and only if argc == 1 – Demi Jun 12 '14 at 16:34
  • @Demetri: see above - "saying something doesn't make it true". If you have `*x` in your code and the compiler knows statically that `x` is `0`, it may not continue generating the naively implied code it would generate when unaware of the runtime value of `x`. Given that the behaviour is undefined, it's free to not generate an object/executable, or to generate one with bogus machine code, with or without a warning. That's my assertion - I hope it's clear - if you simply disagree without having some section of the Standard or a FAQ from Stroustrup or similar then let's agree to disagree.... – Tony Delroy Jun 13 '14 at 04:12
2

I think it depends on the type of undefined behavior. Things that would affect something like structure offsets could cause undefined behavior, which would show up any time code that touches that structure is executed.

In general, however, most undefined behavior happens at run time, meaning only if that code is executed will the undefined behavior occur.

For example, an attempt to modify a string literal has undefined behavior:

char* str = const_cast<char*>("StackOverflow");  // C++11 needs the cast: a literal no longer converts to char*
memcpy(str+5, "Exchange", 8);    // undefined behavior

This "undefined behavior" will not take place until the memcpy executes. It will still compile into perfectly sane code.
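
For contrast, here is a minimal sketch of a well-defined way to make the same edit, assuming the goal is simply a modified copy of the text. The function name `replaceSuffix` is hypothetical. Initializing a `char` array from the literal gives the program its own writable copy.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical helper: the same memcpy as above, but on a writable array.
std::string replaceSuffix()
{
    char str[] = "StackOverflow";          // array holds its own copy, including '\0'
    std::memcpy(str + 5, "Exchange", 8);   // overwrites "Overflow", keeps the '\0'
    return std::string(str);
}

void checkReplaceSuffix()
{
    assert(replaceSuffix() == "StackExchange");
}
```

Both strings are eight characters long, so the terminating null byte of the array is left untouched.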

Another example is omitting the return from a function with a non-void return type:

int foo() {
    // no return statement -> undefined behavior.
}

Here, the undefined behavior occurs at the point where foo returns. (In practice, on x86, the caller often just sees whatever happened to be in the eax register as the return value, though nothing guarantees that.)
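
To make the timing concrete: the behavior is undefined only on an execution that actually flows off the end of a value-returning function. The sketch below uses a hypothetical `sign` function; the broken variant is left in a comment so the example itself stays well-defined.

```cpp
#include <cassert>

// Broken variant, shown only as a comment:
//   int sign(int x) { if (x > 0) return 1; if (x < 0) return -1; }
// Calls with x != 0 return normally; only a call with x == 0 flows off
// the end of the function, and that execution has undefined behavior.

// Fixed variant: every path returns a value.
int sign(int x)
{
    if (x > 0) return 1;
    if (x < 0) return -1;
    return 0;
}

void checkSign()
{
    assert(sign(5) == 1);
    assert(sign(-3) == -1);
    assert(sign(0) == 0);
}
```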

Many of these scenarios can be identified by enabling a higher level of compiler warnings (e.g. -Wall on GCC).
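
Warnings are not the only compile-time aid. In a context that requires a constant expression, an evaluation that would have core-language undefined behavior is not a constant expression, so the compiler has to diagnose it. A small sketch, assuming C++11 or later; `addOne` is a hypothetical function used only for illustration.

```cpp
#include <cassert>
#include <climits>

// Hypothetical function used only for illustration.
constexpr int addOne(int x)
{
    return x + 1;   // signed overflow (UB) when x == INT_MAX
}

// Accepted: these evaluations are well-defined.
static_assert(addOne(41) == 42, "evaluated at compile time");
static_assert(addOne(INT_MAX - 1) == INT_MAX, "still in range");

// Rejected if uncommented: the overflow is not a constant expression,
// so the compiler must diagnose it instead of silently accepting it.
// constexpr int overflowed = addOne(INT_MAX);

void checkAddOne()
{
    assert(addOne(41) == 42);   // the same function also works at run time
}
```

This only helps for code that can be evaluated at compile time; a run-time call such as `addOne(INT_MAX)` with a value the compiler cannot see is not caught this way.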

Jonathon Reinhart
  • 2
    `-Wall` is not the maximum level of compiler error reporting. That "all" suffix doesn't mean all warnings. There are many warnings that are not enabled by `-Wall`. A better view is that `-Wall` is the minimum level of reasonable error reporting. – David Hammen Oct 29 '12 at 00:54
  • Thanks David. I changed the wording, but I like your view as well. – Jonathon Reinhart Oct 29 '12 at 00:55
1

While it is "undefined behaviour", with a particular compiler it will have a predictable behavior of some sort. But because it is undefined, on different compilers that behavior may occur at any point during compilation or at run time.

Keith Nicholas
1

which makes any program containing UB potentially meaningless

Not quite right. A program can't "contain" UB; when we say "UB" that is short for: the program's behaviour is undefined. All of it!

So the program is not just potentially, but actually, meaningless, from the start.

[intro.execution]/5: A conforming implementation executing a well-formed program shall produce the same observable behavior as one of the possible executions of the corresponding instance of the abstract machine with the same program and the same input. However, if any such execution contains an undefined operation, this International Standard places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation).

Lightness Races in Orbit