
I was fooling around and found that the following

#include <stdio.h>

void f(int& x){
    x+=1;
}

int main(){
    int a = 12;
    f(a);
    printf("%d\n",a);
}

when compiled by g++ (Ubuntu 4.8.4-2ubuntu1~14.04.3) 4.8.4 with g++ main.cpp -S, produces this assembly (showing only the relevant parts):

_Z1fRi:
    pushq   %rbp
    movq    %rsp, %rbp
    movq    %rdi, -8(%rbp)
    movq    -8(%rbp), %rax
    movl    (%rax), %eax
    leal    1(%rax), %edx
    movq    -8(%rbp), %rax
    movl    %edx, (%rax)
    popq    %rbp
    ret
main:
    pushq   %rbp
    movq    %rsp, %rbp
    subq    $16, %rsp
    movl    $12, -4(%rbp)
    leaq    -4(%rbp), %rax
    movq    %rax, %rdi
    call    _Z1fRi
    movl    -4(%rbp), %eax
    movl    %eax, %esi
    movl    $.LC0, %edi
    movl    $0, %eax
    call    printf
    movl    $0, %eax
    leave
    ret

Question: Why would the compiler choose to use leal instead of incq? Or am I missing something?

Michael Petch
Adam
  • Please look at the optimised code. And update your post if there are differences. – Richard Critten Jul 18 '17 at 16:06
  • How would it use `incq` in this case? `incq` is unary. And the compiler needs `rdx = rax + 1`. Maybe the real question is: why did it choose to involve two registers where one would be sufficient (with `incq`)? Is this optimized code? Apparently not. – AnT stands with Russia Jul 19 '17 at 01:03
  • `lea` is also useful in that it doesn't affect any flags register values, which is *very* useful to x86[-64] code, where so many instructions do. It's just a better paradigm. `inc` also introduced other issues like partial flags register stalls. The real question is *why* would you use `inc` instead of `lea` here? – Brett Hale Jul 19 '17 at 08:08

1 Answer


You compiled without optimization. GCC does not make any effort to select particularly well-fitting instructions when building in "debug" mode (the default, -O0); it focuses on generating the code as quickly as possible and on making debugging easier, e.g., by letting you set breakpoints on source code lines.

When I enable optimizations by passing the -O2 switch, I get:

_Z1fRi:
    addl    $1, (%rdi)
    ret

With generic tuning, addl is preferred because incl leaves the carry flag unmodified, which on some Intel processors (specifically Pentium 4, but also possibly Knights Landing) creates a false dependency on the previous flags value.

With -march=k8, incl is used instead.

There is sometimes a use-case for leal in optimized code, though, and that is when you want to increment a register's value and store the result in a different register. Using leal in this way would allow you to preserve the register's original value, without needing an additional movl instruction. Another advantage of leal over incl/addl is that leal doesn't affect the flags, which can be useful in instruction scheduling.
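To make the "different register" case concrete, here is a minimal sketch. The function name `times_successor` is hypothetical and not from the question, and the assembly in the comments is only what an optimizing build might plausibly produce, not guaranteed output. The function needs both x and x + 1 at the same time, which is the situation where leal saves a movl.

```cpp
// Hypothetical example (not from the original post): the function needs
// both x and x + 1 at once, so x + 1 must land in a different register
// while x stays intact.
int times_successor(int x) {
    // With g++ -O2 on x86-64 this may compile to something like:
    //     leal  1(%rdi), %eax    # eax = x + 1, edi still holds x
    //     imull %edi, %eax       # eax = (x + 1) * x
    // The exact instructions depend on compiler version and tuning flags.
    return x * (x + 1);
}
```

Compiling this with g++ -O2 -S, as in the question, lets you check what your own compiler actually picks. An incl or addl would overwrite x, so the compiler would first have to copy it with movl.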

Cody Gray - on strike
Florian Weimer
  • Fun fact: `-O0` code is horrible on purpose because it wants to let you *modify* variables in memory with a debugger while stopped at any breakpoint, and still have the program execute as written. This is why *everything* is always spilled to memory after a statement (or source line?), and reloaded after. Debug-info formats unfortunately can't specify that a variable is currently live in a register. – Peter Cordes Jul 20 '17 at 03:14