1

Does gcc optimize code such as:

if(cond)
    func1()
func2()
if(cond)
    func3()

If cond is constant? Also, what are good resources for learning more about C optimizations? (So I don't have to ask more stupid questions like this)

Klas. S
  • 650
  • 9
  • 21
  • By "invariant" you mean "constant"? – Eugene Sh. Jan 30 '17 at 22:15
  • 1
    Depends. Does `func2()` modify `cond`? Can `gcc` *prove* that it does not? – EOF Jan 30 '17 at 22:15
  • Verify by looking at the assembly output. (I'm 100% confident it does optimize this.) I assume by "invariant" you mean "constant expression." – cadaniluk Jan 30 '17 at 22:15
  • Yes, I guess it is invariant when it is in a loop – Klas. S Jan 30 '17 at 22:16
  • 1
    You might be interested in [likely / unlikely](http://stackoverflow.com/questions/109710/likely-unlikely-macros-in-the-linux-kernel-how-do-they-work-whats-their) – David Ranieri Jan 30 '17 at 22:19
  • If it can prove func1 or func2() won't change `cond`, it does. Try it our for yourself: https://godbolt.org/g/Hrkt0S – Petr Skocik Jan 30 '17 at 22:23
  • If I understand the question correctly, you want to know if the code will be optimised when `cond` is some simple boolean expression which may be either true or false at runtime, and which will have the same value during both evaluations, right? In which case, cond is _not_ constant. – davmac Jan 30 '17 at 23:01

3 Answers3

4

Yes compilers will certainly optimize this if they prove that it is allowed and if they find it worthwhile.

I tried on recent version of gcc, icc and clang and they all optimized the above code to, effectively:

if (cond) {
  func1();
  func2();
  func3();
} else {
  func2();
}

That is, they duplicated the call to func2() to enable this optimization. Here's a typical example of the compiled code, from gcc:

        cmp     edi, 33
        je      .L7
        xor     eax, eax
        jmp     func2
.L7:
        sub     rsp, 8
        xor     eax, eax
        call    func1
        xor     eax, eax
        call    func2
        xor     eax, eax
        add     rsp, 8
        jmp     func3

Of course, such an optimization isn't guaranteed - a compiler may decide it isn't worth it, and it may depend on compiler-specific heuristics and compile options. For example, when using -Os (as opposed to -O2 which I used above), gcc no longer re-arranges it and instead compiles it more or less as you've written it, with two cmp instructions:

        cmp     edi, 33
        jne     .L5
        xor     eax, eax
        call    func1
.L5:
        xor     eax, eax
        call    func2
        cmp     edi, 33
        jne     .L4
        xor     eax, eax
        jmp     func3
.L4:
        ret

On the other hand, both clang and icc continue to compile it with a single comparison while duplicating the call.

You can play with all of this on godbolt.

BeeOnRope
  • 60,350
  • 16
  • 207
  • 386
1

Generally yes. Common sub expressions are calculated at function scope, so if we have if( x * x / y > z + z) appearing twice, x, y and z unchanging, the expression will only be calculated once. However the compiler needs to know, for example that the address of the variables has not been taken and passed to a subroutine somewhere.

Malcolm McLean
  • 6,258
  • 1
  • 17
  • 18
1

Yes.

You can check what a compiler is doing by looking at its assembly code. I've created a little C program to work with, making sure it has some variability so the optimizer doesn't just constant fold everything.

#include <stdio.h>

void func1(const char input) {
    printf("func1: %c\n", input);
}

void func2(const char input) {
    printf("func2: %c\n", input);
}

void func3(const char input) {
    printf("func3: %c\n", input);
}

int main() {
    char input = (char)getchar();

    if(input == 'Y') {
        func1(input);
    }

    func2(input);

    if(input == 'Y') {
        func3(input);
    }
}

You can generate assembly with -S.

gcc -S test.c -o test.asm

Note that this is without optimizations. You can find the comparison without having to read much assembly, look for 89 which is the ASCII decimal representation of Y.

LCFI10:
        subq    $16, %rsp
        call    _getchar
        movb    %al, -1(%rbp)
        cmpb    $89, -1(%rbp)
        jne     L5
        movsbl  -1(%rbp), %eax
        movl    %eax, %edi
        call    _func1
L5:
        movsbl  -1(%rbp), %eax
        movl    %eax, %edi
        call    _func2
        cmpb    $89, -1(%rbp)
        jne     L6
        movsbl  -1(%rbp), %eax
        movl    %eax, %edi
        call    _func3
L6:
        movl    $0, %eax
        leave

That's a fairly rote translation of the two if blocks. You can see two cmpb $89, -1(%rbp) calls indicating the comparison is being done twice.

Now with optimizations: gcc -S -O3 test.c -o test.asm

LCFI0:
        call    _getchar
        cmpb    $89, %al
        je      L10
        leaq    lC1(%rip), %rdi
        movsbl  %al, %esi
        xorl    %eax, %eax
        call    _printf
L7:
        xorl    %eax, %eax
        addq    $8, %rsp
LCFI1:
        ret
L10:
LCFI2:
        leaq    lC0(%rip), %rdi
        movl    $89, %esi
        xorl    %eax, %eax
        call    _printf
        movl    $89, %esi
        xorl    %eax, %eax
        leaq    lC1(%rip), %rdi
        call    _printf
        movl    $89, %esi
        xorl    %eax, %eax
        leaq    lC2(%rip), %rdi
        call    _printf
        jmp     L7

Now there's only one cmpb $89, %al, but there's also multiple movl $89, %esi. lC0, lC1, and lC2 are the strings "func1: %c\n", "func2: %c\n" and "func3: %c\n" cooresponding to the three functions.

For comparison, here's what clang does.

Ltmp12:
        .cfi_offset %rbx, -24
        callq   _getchar
        movsbl  %al, %ebx
        movzbl  %bl, %eax
        cmpl    $89, %eax
        jne     LBB3_2
## BB#1:
        leaq    L_.str(%rip), %rdi
        xorl    %eax, %eax
        movl    %ebx, %esi
        callq   _printf
        leaq    L_.str.1(%rip), %rdi
        xorl    %eax, %eax
        movl    %ebx, %esi
        callq   _printf
        leaq    L_.str.2(%rip), %rdi
        jmp     LBB3_3
LBB3_2:
        leaq    L_.str.1(%rip), %rdi

As with gcc there's now only one comparison. And, similarly, L_.str, L_.str.1, and L.str.2 are the strings in func1, func2, and func3 respectively.

Both have, essentially, changed the code to be this.

if( input == 'Y' ) {
    printf("func1: %c\n", input);
    printf("func2: %c\n", input);
    printf("func3: %c\n", input);
}
else {
    printf("func2: %c\n", input);
}
Schwern
  • 153,029
  • 25
  • 195
  • 336