Why does GCC redundantly load the address into rax here?

Question

I noticed that with optimizations turned off, GCC (version 13.1.1 20230714), when asked to compile the C program below, loads the address of the string (i.e., .string "HI" in the assembler output) into rax and then copies it into rdi before calling the function -

#include <stdio.h>

int main() {
    printf("HI\n");
    return 0;
}

The assembler output with optimizations turned off (-O0) -

    ...
    lea rax, .LC0[rip]
    mov rdi, rax
    call    puts@PLT
    ...

And with optimizations turned on (for -O1, -O2, and -O3) -

    ...
    lea rdi, .LC0[rip]
    call    puts@PLT
    mov eax, 0
    ...

I understand that I can expect the optimized versions to do things more... optimally. But nothing about the C code I wrote makes the unoptimized assembly a more "one-to-one" mapping of C to assembly, so I don't see why gcc should default to loading the address into rax first. In any case, gcc is clearly already optimizing something at -O0, since it replaced the call to printf with a call to puts.

Can someone help explain why gcc does this? Does it have any advantage, or is it simply not worth making the -O0 compilation load the address directly into rdi? If this is the default behaviour for all function calls (before optimization), why does gcc load the first argument into rax? Shouldn't loading the first argument directly into rdi be the sensible default?

GCC version - gcc (GCC) 13.1.1 20230714

Platform/OS - Linux 6.4.3-arch1-2 x86_64 GNU/Linux

Compilation command - gcc -O0/1/2/3 -S -masm=intel load.c
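For reference, the printf-to-puts substitution mentioned above can be isolated in a small sketch. The function names `greet_printf`, `greet_puts`, and `greet_counted` are hypothetical, and my understanding (an assumption, not something shown in the output above) is that GCC only makes this substitution when the return value of printf is unused:

```c
#include <stdio.h>

/* The return value of printf is ignored here, so the compiler is free to
   emit a call to puts("HI") instead, as seen in the -O0 output above. */
void greet_printf(void) {
    printf("HI\n");
}

/* Hand-written equivalent of the substituted call: puts appends the
   trailing newline itself, so the literal no longer contains it. */
void greet_puts(void) {
    puts("HI");
}

/* Here the return value (number of characters written) is used, so,
   as far as I know, the call has to remain a real printf. */
int greet_counted(void) {
    return printf("HI\n");
}
```

Compiling this with the same command as above and comparing the three call sites should show whether the substitution happens in each case.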

What's the point of worrying about what GCC does without optimization? You won't be compiling the code for real use without optimization, will you? Without optimization, the compiler does whatever it first thinks of that will do the job according to the rules it knows. Optimization gives it a chance to think again. IMO, you are worrying about how many angels can dance on the head of a pin, only in computerese. — Jonathan Leffler, Jul 19 '23 at 18:48
I was simply curious. I would like to learn (if possible) what GCC does here just because I think it *may* be an opportunity to learn something about GCC/compilation, not because I have any immediate use for it — ZarakshR, Jul 19 '23 at 18:50
`rax` may simply be where GCC starts evaluating any expression. Have `a = b + c*d;`? GCC loads whichever operand it starts working on first into `rax`, the next into `rbx`, and so on, and evaluates the expression using whatever registers it “chooses.” In `printf("HI\n");`, the evaluation of `"HI\n"` is to load its address, so GCC loads it into `rax`. Then it needs to do the function call, so it moves the value into the register used for passing the first argument. — Eric Postpischil, Jul 19 '23 at 18:59
Is that the actual source code? Note that `printf("HI\n");` is not the same as `puts("HI\n");` and it seems a bit much for non-optimised code to create another string literal without the newline. — Weather Vane, Jul 19 '23 at 19:07
@WeatherVane: GCC and Clang do that, GCC even with optimization off. — Eric Postpischil, Jul 19 '23 at 19:20
It's purely GCC internals leaking out into the asm because `-O0` does so little optimization that it doesn't remove this redundancy of evaluating an expression into the return-value register and then having to copy it when it's needed somewhere else. — Peter Cordes, Jul 19 '23 at 19:54
You left in the `mov eax, 0` *after* the call in your gcc `-O1` output but not the debug build. That's part of the `return 0`, not the call sequence. (GCC `-O2` or higher enables `-fpeephole2` which will use `xor eax,eax` to zero a register.) — Peter Cordes, Jul 19 '23 at 20:25
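
To make the comments above concrete, here is a rough C-level analogy (my own sketch, not something GCC literally emits): at `-O0` the argument expression is evaluated into a scratch location first and only then copied to where the call needs it. The function names `say_hi_via_temp` and `say_hi_direct` are hypothetical, and calling `puts` directly simply mirrors the `call puts@PLT` in the question's output.

```c
#include <stdio.h>

/* Analogy for the -O0 sequence: the argument expression is evaluated
   on its own first (lea rax, .LC0[rip] in the question's output)... */
int say_hi_via_temp(void) {
    const char *tmp = "HI";
    /* ...and only then copied into the first-argument register
       (mov rdi, rax) right before the call. */
    return puts(tmp);
}

/* Analogy for the optimized sequence: the address goes straight into
   the argument register (lea rdi, .LC0[rip]). */
int say_hi_direct(void) {
    return puts("HI");
}
```

Both functions behave identically at run time. The analogy is only approximate: a named local such as `tmp` would typically also be stored to the stack at `-O0`, whereas the temporary in the question's output lives only in `rax`.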


0 Answers