Out of order execution - can it bypass control statements?

Question

Regarding out-of-order (OOO) execution, let's assume I have a single process with a single thread that runs this code:

void foo() {
    if (x == 0) {
        return;
    }

    y->data = 5;
}

Now, let's assume I know that y is valid only if x is not zero. From the hardware's perspective, can the CPU execute y->data = 5 before reading x? That could cause the CPU to access a NULL or garbage pointer and crash.

And if not, what is the reason? Is it because if/while/for/goto are control statements, and the CPU will not fetch instructions ahead once it sees a control statement?

As I remember, OOO should be 100% transparent to a single thread executing its instructions.

No, it can't. OOE must respect the sequential semantics of the program. — Mysticial, Jul 02 '13 at 19:22
So is this Wikipedia article, which states that the first thread can bypass the while loop, wrong? http://en.wikipedia.org/wiki/Memory_barrier "Similarly, processor #1's load operations may be executed out-of-order and it is possible for x to be read before f is checked" — Itay Marom, Jul 02 '13 at 19:27
Memory barriers are different. They only apply when there are multiple threads. In your example, as long as there is no other thread writing to `x`, it will behave properly. — Mysticial, Jul 02 '13 at 19:31

score 2 · Accepted Answer · edited May 23 '17 at 12:11

Depends on how you look at it.

From the user's perspective, no.
From the CPU's perspective, yes.

From the user's perspective, the behavior of the program must be "as if" it was run sequentially.
In other words, there is no visible difference between being run sequentially and being run with OOE. (aside from maybe performance)

From the CPU's perspective, yes it actually can bypass the if-statement and execute y->data = 5;. But this is because of branch prediction rather than OOE.

On a modern processor, it is possible for the processor to mispredict the branch:

if (x == 0) {
    return;
}

and actually try to execute y->data = 5;...

If this happens and y is a bad pointer, the access will raise a hardware exception, but that exception is withheld because execution is still speculative.

Once the processor realizes that it has mispredicted the branch, it will throw away everything past the branch (including the exception).

So in the end, there is nothing to worry about. Even if the processor tries to do something it can't, it won't affect the sequential behavior.

In other words, a modern processor will clean up after itself if it makes a mess that isn't your fault.
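
As a rough illustration of the "as if" rule from the programmer's side, here is a minimal sketch. The struct `item`, the definitions of `x` and `y`, the return value of `foo`, and the helper `demo` are all assumptions added for illustration; the question does not show them. Whatever the core does speculatively, the early-return path never faults architecturally:

```c
#include <stddef.h>

/* Hypothetical declarations: the question does not show how x and y are defined. */
struct item { int data; };

int x = 0;
struct item *y = NULL;

/* Same shape as foo() from the question, changed to return 1 if the
   store was architecturally executed and 0 if the guard returned early. */
int foo(void)
{
    if (x == 0) {
        return 0;   /* y is never dereferenced on this path */
    }
    y->data = 5;
    return 1;
}

/* Hypothetical driver: exercises both paths of foo().
   Returns the stored value, or -1 if either path misbehaved. */
int demo(void)
{
    struct item it = { 0 };

    x = 0;
    y = NULL;       /* invalid pointer, but guarded by x == 0 */
    if (foo() != 0) {
        return -1;
    }

    x = 1;
    y = &it;        /* valid pointer, guard lets the store through */
    if (foo() != 1) {
        return -1;
    }

    return it.data;
}
```

Calling `demo()` should yield 5: the first call to `foo()` returns early with `y` still NULL, and only the second call performs the store.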

Things get uglier when you have multiple threads, but that's outside the scope of this question.

Thanks! The part I was missing is that even the hardware exceptions are withheld. I knew that results are committed by a sequential mechanism, but I didn't know about the hardware exceptions. — Itay Marom, Jul 02 '13 at 21:08
Yep. Hardware exceptions are not passed up to the OS until the processor is absolutely sure that they are real. So not even the OS knows that the processor messed up and corrected itself. — Mysticial, Jul 02 '13 at 21:12

ouah · Answer 2 · 2013-07-02T20:04:49.180

The answer is: it depends.

The C Standard describes an abstract machine in which issues of optimizations are irrelevant. Except for access to volatile objects, the implementation is free to reorder statements if it does not change the observable behavior of the program. C11 came up with a definition for the observable behavior of the program:

(C11, 5.1.2.3p6) "The least requirements on a conforming implementation are:

— Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.

— At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.

— The input and output dynamics of interactive devices shall take place as specified in 7.21.3. The intent of these requirements is that unbuffered or line-buffered output appear as soon as possible, to ensure that prompting messages actually appear prior to a program waiting for input.

This is the observable behavior of the program"

C11 also has this paragraph (already present in C99):

(C11, 5.1.2.3p10) "Alternatively, an implementation might perform various optimizations within each translation unit, such that the actual semantics would agree with the abstract semantics only when making function calls across translation unit boundaries. [...]"

So in your example, it will actually depend on how y is declared (i.e., what's its linkage?) and how it is used in the program.

Compilers do use reordering in real life to optimize parts of the code when they can see that it does not affect the observable behavior of the program.
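
A small sketch of what that freedom looks like, assuming hypothetical globals `a`, `b`, `flag` and a function `publish` that are not part of the question. The comments describe what the standard permits, not what any particular compiler will actually do:

```c
/* Hypothetical globals, for illustration only. */
int a, b;
volatile int flag;

int publish(void)
{
    /* Plain stores: not observable behavior by themselves, so the
       compiler may reorder them with respect to each other. */
    a = 1;
    b = 2;

    /* Volatile stores: accesses to volatile objects are observable
       behavior, so both must be performed, and in this order. */
    flag = 1;
    flag = 2;

    /* Volatile read of flag, then plain arithmetic on a and b. */
    return a + b + flag;
}
```

In a single thread the function returns 5 regardless of how the plain stores are scheduled, which is exactly why the reordering is allowed.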
