Swap function potential missed optimization? (gcc)

Asked Sep 03 '22 at 18:15

Active Sep 03 '22 at 18:38

I wrote this swap function in Linux x86-64 assembly.

swap: ; written by me
    mov al, byte [rdi]
    xchg byte [rsi], al
    mov byte [rdi], al
    ret

Out of curiosity, I also compiled the following C code with -O3:

void swap(char *a, char *b) {
  char temp = *a;
  *a = *b;
  *b = temp;
}

I then used objconv on the result to get the following assembly:

swap:   ; written by GCC -O3
        endbr64                                         ; 0000 _ F3: 0F 1E. FA
        movzx   eax, byte [rdi]                         ; 0004 _ 0F B6. 07
        movzx   edx, byte [rsi]                         ; 0007 _ 0F B6. 16
        mov     byte [rdi], dl                          ; 000A _ 88. 17
        mov     byte [rsi], al                          ; 000C _ 88. 06
        ret                                             ; 000E _ C3

GCC's version also makes sense and does the job, but my code is shorter. Is this a missed optimization? If so, why does GCC miss it?

edited Sep 03 '22 at 18:24

avighnac

Your code is shorter but slower. – Jester Sep 03 '22 at 18:16
Hmm, interesting. Why is it slower? – avighnac Sep 03 '22 at 18:17
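Because `xchg` automatically implies a `lock` when used with a memory operand, which makes it an atomic read-modify-write and far more expensive than a plain load and store. Also, writing `al` leaves the rest of `rax` untouched, which might cause a false dependency on the register's old value. – Jester Sep 03 '22 at 18:19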
Please elaborate. – avighnac Sep 03 '22 at 18:19
Got it. So basically, **for performance**, `xchg` for two registers is totally ok, but is not ok for a memory operand. – avighnac Sep 03 '22 at 18:36
Fewer instructions is not always faster; there are often cases where more instructions run faster (memory accesses vs. registers, etc.). – old_timer Sep 03 '22 at 22:27
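
To make the point from the comments concrete, here is a minimal C sketch of the two variants. The names `swap_plain` and `swap_xchg` are made up for illustration and are not from the question. The assumption is that a sequentially consistent `atomic_exchange` on x86-64 is typically compiled to an `xchg` with a memory operand, so `swap_xchg` roughly corresponds to the hand-written assembly, while `swap_plain` is the function GCC was actually given.

```c
#include <stdatomic.h>

/* Plain swap, as in the question: two loads and two stores.
   Nothing here asks for atomicity, so the compiler is free to
   emit ordinary mov/movzx instructions. */
void swap_plain(char *a, char *b) {
    char temp = *a;
    *a = *b;
    *b = temp;
}

/* Hypothetical counterpart of the hand-written version: the
   exchange with *b is one atomic read-modify-write. The _Atomic
   qualifier on b is an assumption added for this sketch; the
   original C function has no such requirement. */
void swap_xchg(char *a, _Atomic char *b) {
    char temp = *a;                   /* mov  al, [rdi]          */
    temp = atomic_exchange(b, temp);  /* xchg [rsi], al (locked) */
    *a = temp;                        /* mov  [rdi], al          */
}
```

Both functions leave the two bytes swapped, but only the second promises that the exchange with `*b` is atomic. Since the C source in the question never asks for that guarantee, GCC has no reason to pay for the implicit `lock`, which would explain why it picks the longer but cheaper sequence.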
