why "guarded do" is better than "jump to middle" in asm

Question

I was reading a textbook which describes two way of translating a while loop into machine code:

//jump to middle
   goto test;
loop:
   body-statement
test:
   t = test-expr;
   if (t)
       goto loop;

and

//guarded do

t = test-expr;
if (!t)
   goto done;
loop:
   body-statement
   t = test-expr;
   if (t)
      goto loop;
done:

I don't know why using the secoond aproach, the compiler can optimize the initial test?

I suppose it's about avoiding the unconditional jump (`goto test`). — Hulk, Jul 13 '20 at 06:11
can also be `loop: t = test-expr; if (!t) goto done; body-statement goto loop;` — bruno, Jul 13 '20 at 06:13
Since this is about "machine code" it's not a compiler optimization thing. You would never write C like this. — nickelpro, Jul 13 '20 at 06:35
Related question: https://stackoverflow.com/questions/47783926/why-are-loops-always-compiled-into-do-while-style-tail-jump — njuffa, Jul 13 '20 at 07:17
@nickelpro well, it could be about optimization if it turns out that the second version is preferred because of branch prediction or something like that. From the context of the question i would of course assume that the presented code is meant to be pseudocode for the generated assembler, not the actual C code fed to the compiler. — Felix G, Jul 13 '20 at 07:22
I am not a compiler engineer, but my understanding from working with compiler engineers in the past is that single-entry single-exit (SESE) control flow primitives are preferred by the optimizing phases of compilers (or at least that used to be true in the past). The "jump to middle" approach results in a loop with two entry points, so not SESE-style control flow. — njuffa, Jul 13 '20 at 07:34
@njuffa: I think the main downside is just the unconditional jump on all paths through the function. Peeling the before-first-iteration test to make `if(run_at_all) do_loop` avoids that by duplicating the test code, instead of jumping to a single copy of it. (Sometimes it can then be optimized away, many loops have a known trip-count, or at least known non-zero.) — Peter Cordes, Jul 13 '20 at 10:02
@njuffa: Neither of these are proposed as actual C *input* to a compiler, which is where compilers would care about SESE. It's just using C as pseudo-code for asm. Generating a non-SESE asm loop from a C SESE `while(t){}` loop is fine. — Peter Cordes, Jul 13 '20 at 10:07
@PeterCordes I was referring to intermediate representations inside compilers, nothing to do with HLL source code. I seem to recall a SESE-fying phase for loops in some compilers, because auto-vectorization is easier to apply to a loop in SESE-form. Presumably it is this sort of compiler phase that also transforms while-type loops into do-type loops plus a pre-check. But this is really not my area of expertise. — njuffa, Jul 13 '20 at 10:18
@njuffa: me neither, I usually only look at what goes in and what comes out, with only a rough idea of how the sausage is made in the middle. I'd guess that peeling the check happens really late, after any steps that care about a loop being SESE, otherwise compilers would be self-defeating because all good compilers do that optimization for `while(t){...}`. Or happens really early and then later passes just look at the SESE do-while loop. Either way, most optimizations don't depend on humans writing `do{}while()` loops. (Removing the test before the first iter is nice though). — Peter Cordes, Jul 13 '20 at 10:34

score 1 · Accepted Answer · answered Jul 13 '20 at 10:10

1

I don't know why using the secoond aproach, the compiler can optimize the initial test?

This is what the asm / machine code might look like after the compiler is done, using C as pseudo-code for asm. (It's not useful to write C like this.) There's no further optimization / compilation happening after we get to this step.

In actual C you'd just write while(t) or for(int i=0 ; i<n ; i++) like a normal person (an idiomatic C loop), and let the compiler create an idiomatic asm loop (conditional branch at the bottom). That's what compilers are for.

Related: Why are loops always compiled into "do...while" style (tail jump)?

The main downside of jumping into the middle is that it puts an extra unconditional jump on all paths through the function. (Whether the loop runs zero iterations or not).

Peeling the before-first-iteration test to make if(run_at_all) do_loop avoids that jump by duplicating the test code, instead of jumping to a single copy of it.

Sometimes it can then be optimized away: many loops have a known trip-count, or at least known non-zero if the compiler can prove anything about t for the first iteration.

answered Jul 13 '20 at 10:10

Peter Cordes

328,167
45
605
847

sorry I still don't get it. Also, there are only one test-expr in "jump to middle" but there are two test-exprs in the "guarded do", isn't that "jump to middle" more efficient? – Jul 13 '20 at 11:16
@amjad: There's more detail on this in [Why are loops always compiled into "do...while" style (tail jump)?](https://stackoverflow.com/q/47783926) Look at how many instructions run overall, i.e. dynamic instruction count. You're I think only looking at *static* instruction count (how many are in the file / memory). Yes, peeling the first condition-check trades size for speed, a similar tradeoff to unrolling. – Peter Cordes Jul 13 '20 at 11:18
2

@amjad: On old-fashioned processors one tended to assume that *taking* a jump was always expensive, whether conditional or unconditional; the extra test and conditional jump which is usually *not* taken could thus be cheaper than the unconditional jump which is *always* taken. On modern CPUs with branch prediction, this is not really true anymore. – Nate Eldredge Jul 13 '20 at 13:23
1

The "guarded-do" approach has an additional advantage over "jump-to-middle", it introduces a location outside the loop that will executed if and only if the loop is executed. This will give the compiler a location to place things like invariant computations and loop setup code. – Johan Jul 14 '20 at 12:39

why "guarded do" is better than "jump to middle" in asm

1 Answers1