Background
I've assumed for a while that gcc will convert while-loops into do-while form. (See Why are loops always compiled into "do...while" style (tail jump)?)
And that -O0
for while-loops...
while (test-expr)
body-statement
..Will generate code on the form of jump-to-middle do-while
goto test;
loop:
body-statement
test:
if (test-expr) goto loop;
And gcc -O2 will generate guarded do while
if (test-expr)
goto done;
loop:
body-statement
if (test-expr) goto loop;
done:
Concrete examples
Here are godbolt examples of functions for which gcc generates the kind of control flow I'm describing above (I use for-loops but a while loop will give the same code).
This simple function...
int sum1(int a[], size_t N) {
int s = 0;
for (size_t i = 0; i < N; i++) {
s += a[i];
}
return s;
}
Will for -O0
generate this jump to middle code
```sum1:
push rbp
mov rbp, rsp
mov QWORD PTR [rbp-24], rdi
mov QWORD PTR [rbp-32], rsi
mov DWORD PTR [rbp-4], 0
mov QWORD PTR [rbp-16], 0
jmp .L2
.L3:
mov rax, QWORD PTR [rbp-16]
lea rdx, [0+rax*4]
mov rax, QWORD PTR [rbp-24]
add rax, rdx
mov eax, DWORD PTR [rax]
add DWORD PTR [rbp-4], eax
add QWORD PTR [rbp-16], 1
.L2:
mov rax, QWORD PTR [rbp-16]
cmp rax, QWORD PTR [rbp-32]
jb .L3
mov eax, DWORD PTR [rbp-4]
pop rbp
ret
Will for -O2 generate this guarded-do code.
sum1:
test rsi, rsi
je .L4
lea rdx, [rdi+rsi*4]
xor eax, eax
.L3:
add eax, DWORD PTR [rdi]
add rdi, 4
cmp rdi, rdx
jne .L3
ret
.L4:
xor eax, eax
ret
My question
What I'm after is hand-wavy rule to apply when looking at -Os
loops. I'm more used to looking at -O2
code and now that I'm working in the embedded field where -Os
is more prevalent, I'm surprised by the form of loops I see.
It seems that gcc -Og
and -Os
both generate code as a jmp
at a bottom and if() break
at the top. Clang on the other hand generated guarded-do-while A godbolt link to gcc and clang output
Here is an example of gcc -Os output for the above function:
sum1:
xor eax, eax
xor r8d, r8d
.L2:
cmp rax, rsi
je .L5
add r8d, DWORD PTR [rdi+rax*4]
inc rax
jmp .L2
.L5:
mov eax, r8d
ret
- Am I correct in assuming that gcc
-Og
and-Os
generates code on the form I described above? - Does anyone have a resource that describes the rationale for using while-form for
-Og
and-Os
? Is it by design or an accidental fall-out form the way the optimization passes are organized. - I thought that converting loops into do-while form was part of the early canonicalization done by compilers? How come gcc
-O0
generates do-while but gcc-Og
gives while-loops? Do that canonicalization only happen when optimization is enabled?
Sidenote: I'm surprised by how much code generated with -Os and -O2 differ given that there aren't many compiler flags that are different. Maybe many passes checks some variable for tradeoff_speed_vs_space
.